Preface


This little book began as a set of course notes for an unusual but very at- 
tractive freshman course in algebra for math majors. The course introduces 
students to the notions of rigorous mathematics in the familiar settings of the 
integers and polynomials. This is worthwhile because of the strong parallels 
between the two theories. Indeed, one can argue that it is these parallels 
that led to the theory of commutative algebra as a unifying force. 

The current book is an expanded version of those notes. Some material 
has been added, and many more exercises are included. Historical notes are 
given at the end of each chapter with references to a few sources for the 
material. 

These topics have the advantage of being somewhat familiar to a good 
high school graduate, yet harbour many interesting unforeseen results. The 
number of different proof techniques in the book makes this a good intro- 
duction to a wide variety of new ideas. In particular, special emphasis has 
been paid to the role of algorithms in mathematics. Due to the increased use 
of symbolic computing, and especially because of the availability of MAPLE 
here at Waterloo, it has been natural to investigate the theory behind many 
of these computations. It also provides an opportunity to have students work
out problems with much larger numbers. Many other symbolic computation 
programs, such as MATHEMATICA, are equally good for use in this course. 

This course has been taught at the University of Waterloo for over thirty 
years. Until about a decade ago, roughly 800-1200 first year students in the 
mathematical sciences took a course using the textbook Classical Algebra by 
W.J. Gilbert, now in a revised edition co-authored by S.A. Vanstone. 
About 5% of these students took the ‘advanced’ version using these notes. 

These notes were used for a one semester course. We would cover much 
of the material in this book, but not all. In writing this book, it has seemed 
advisable to expand on certain connections beyond the scope of the course. 
It is hoped that this will provide greater flexibility for the instructor and 
additional reading for the interested student. 




Students entering university to study mathematics have probably en- 
countered prime numbers. Chances are great that they believe every integer 
factors uniquely into a product of primes, but have not seen a proof. This 
important fact, known as the Fundamental Theorem of Arithmetic, is of 
crucial importance in the theory of numbers. It is not easy to prove. More 
importantly, it is not intuitively obvious. Indeed, its significance is only re- 
alized with very large numbers beyond our real experience. The crucial fact 
that enables us to prove this with relative ease is the Euclidean Algorithm 
for finding greatest common divisors. Chapters 1 and 2 deal with these
basic properties of the integers and modular arithmetic. After giving the 
proof of the Fundamental Theorem of Arithmetic, we show that, in fact, the 
proof technique applies in much greater generality. In Section 1.8 we define
Euclidean Domains and prove that all such rings have unique factorization. 
Throughout the book, we see applications of this general theorem in a large 
variety of settings, such as the Gaussian integers and polynomial rings over
a field. 

It is worth noting that there are number systems not very much different 
from the integers in which unique factorization into primes fails. Far from 
being a disaster, this is an opportunity to investigate why this phenomenon
occurs. It shows us which properties of the integers themselves are crucial 
to make the theory work. That is why we make a foray into quadratic 
number domains in Chapter 3. Already the material covered in these early chapters
allows us to prove Quadratic Reciprocity, one of the crowning achievements
of elementary number theory. 

A nice application of modular arithmetic is the Rivest-Shamir-Adleman
public key cryptography scheme. This code, which is covered in Chapter 4,
allows the author to publish the method of encoding a message in a public 
place, while keeping the method of decoding the message secret. This is 
a rather different idea in coding, as for all previously known codes, the 
method of decoding merely reversed the encoding method. The secret here 
is that it is very easy (with a computer) to find large primes (say 200-300 
digits) but very difficult to factor the product of two large primes. When 
one first encounters the problem of determining if a given number is prime, 
it is natural to try the brute force method of dividing by all numbers up 
to the square root. However, it turns out there are beautiful and clever 
methods to test for primality without finding any factors at all. We delve 
more deeply into this subject, briefly discussing the Agrawal-Kayal-Saxena 
algorithm and its connection to the topics we have seen thus far. We also 
discuss the probabilistic test due to Miller-Rabin. 

In Chapter 5 we introduce the complex numbers. There is a tacit assumption
that the student is already reasonably familiar with the real numbers
from studying calculus. However, a section is devoted to a brief discussion
of how the real numbers are developed. The main result of this chapter
is the Fundamental Theorem of Algebra, which states that every complex 
polynomial factors into a product of linear terms. We emphasize how analytic
techniques play a key role in the proof of this cornerstone algebraic
result. The proof we give is one of the simplest, and relies on the Extreme 
Value Theorem. We also develop the complex exponential function, which 
plays a vital role in applications of the complex numbers. 

In Chapter 6 we show that the same theory developed for the integers
applies to the algebra of polynomials. In particular, there is a Euclidean 
Algorithm and unique factorization into irreducible polynomials. We exam- 
ine various tests for irreducibility, and study connections with irrationality 
of the roots. We then follow up with special topics about real and complex 
polynomials such as Sturm’s Theorem for counting real roots, and the formula
for solving cubics. In Chapter 7 we study finite fields in some detail.
We draw parallels between modular arithmetic for the integers and arith- 
metic modulo an irreducible polynomial. Many of the results we have seen 
for $\mathbb{Z}_p$ in earlier chapters carry over to all finite fields. A rather beautiful
application of these ideas is an algorithm for factoring polynomials over the 
rationals. This algorithm is based on a method for factoring polynomials 
modulo a prime integer p. It turns out that factoring a polynomial of degree 
d mod p is much easier than factoring a d digit base p number. 

We would like to take this opportunity to thank the people who have 
helped with this endeavour. In particular, the first author thanks Stanley 
Burris with whom he has had many enjoyable conversations about this ma- 
terial. The first author also thanks Keith Geddes for some conversations 
on the algorithms used by MAPLE. The second author would like to thank 
David Jao and Stephen New for answering questions about the practical as- 
pects of RSA. It is a pleasure to thank Anton Mosunov for a careful reading 
of an early draft of the new version of this book and for sending us detailed 
comments and corrections. We thank the referees and editors at AMS/MAA 
for their helpful comments. Lastly, we thank the many students in Math 
145 classes who suffered through various versions of these notes and offered 
many helpful suggestions and corrections. 


Kenneth R. Davidson 
Matthew Satriano 
Waterloo, January, 2023 


Chapter 1 


The Integers 


The basic object which we shall study in the first four chapters is the set 
of integers. As a mathematical object, the integers have a wealth of structure.
First, you can add, subtract and multiply integers together. It is
the multiplicative structure which is of most interest, because the recipro- 
cal operation of division is not always defined (within the integers). The 
notion of divisibility leads to the definition of prime numbers, and then to 
the factorization of numbers into primes. The reader may well have been 
told that every number factors into primes in a unique way. This non-trivial 
result is known as the Fundamental Theorem of Arithmetic. It is far from 
obvious. We will prove it in this chapter. In the last section, we will show 
that essentially the same argument will work in a very abstract context. The 
advantage of doing this is that we will later see several explicit, important 
contexts to which it applies, such as the ring of all polynomials. 


1.1. Basic Properties 


The integers are the set

$$\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}.$$


Beyond being a set, Z comes with the operations of addition and multipli- 
cation. Addition has an inverse operation called subtraction. However the 
inverse operation of multiplication, namely division, does not always yield 
an integer answer, which leads to the notion of divisibility. Describing the 
integers takes a little time, but the following list of properties is natural. 


[S1] The integers consist of a set $\mathbb{Z}$ together with two binary operations,
     addition ($+$) and multiplication ($\cdot$).
[A1] (commutativity of addition) For all $a, b \in \mathbb{Z}$,
     $a + b = b + a$.
[A2] (associativity of addition) For all $a, b, c \in \mathbb{Z}$,
     $(a + b) + c = a + (b + c)$.
[A3] (additive identity or zero) There is an element $0 \in \mathbb{Z}$ so that
     for all $a \in \mathbb{Z}$,
     $a + 0 = a = 0 + a$.
[A4] (additive inverses) For each $a \in \mathbb{Z}$, there is an element $-a \in \mathbb{Z}$
     such that
     $a + (-a) = 0$.
[M1] (commutativity of multiplication) For all $a, b \in \mathbb{Z}$,
     $a \cdot b = b \cdot a$.
[M2] (associativity of multiplication) For all $a, b, c \in \mathbb{Z}$,
     $(a \cdot b) \cdot c = a \cdot (b \cdot c)$.
[M3] (multiplicative identity or one) There is an element $1 \in \mathbb{Z}$ so
     that for all $a \in \mathbb{Z}$,
     $a \cdot 1 = a = 1 \cdot a$.
[D1] (distributive law) For all $a, b, c \in \mathbb{Z}$,
     $(a + b) \cdot c = a \cdot c + b \cdot c$.


We did not define subtraction; it is enough to include the additive inverse.
That is because $a - b$ is just an abbreviation for $a + (-b)$.

This is certainly a list of properties satisfied by the integers. But this 
collection of properties is satisfied by many other mathematical sets. For
example, the collection $\mathbb{R}$ of all real numbers and the set $\mathbb{Q}$ of rational
numbers (fractions) both satisfy them. So does the set

$$\mathbb{Z}[\sqrt{3}] = \{a + b\sqrt{3} : a, b \in \mathbb{Z}\}$$

with the usual operations. Consider also the set

$$\mathbb{Z} \oplus \mathbb{Z} = \{(a, b) : a, b \in \mathbb{Z}\}$$

with coordinate-wise addition and multiplication, i.e.,

$$(a, b) + (c, d) = (a + c, b + d) \quad\text{and}\quad (a, b) \cdot (c, d) = (ac, bd).$$


This also satisfies these laws. What are the zero and one in this case? 

In fact a great many mathematical objects satisfy these laws. They are 
called commutative rings. The word ring is used for a set satisfying all 
these laws except [M1], commutativity of multiplication, and with another
distributive law added: 


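[D2] For all $a, b, c \in \mathbb{Z}$,
     $a \cdot (b + c) = a \cdot b + a \cdot c$.

The set of 2 × 2 matrices with integer entries is an example of a non-commutative
ring. Addition is coordinate-wise, but multiplication is defined
by the rule

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} w & x \\ y & z \end{pmatrix} = \begin{pmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \end{pmatrix}.$$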
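
To see concretely that this multiplication is not commutative, here is a small sketch in Python. The representation of a matrix as a pair of rows and the helper name `mat_mul` are our own choices for illustration; they are not notation used elsewhere in this book.

```python
# A 2x2 integer matrix is stored as ((a, b), (c, d)).
# mat_mul is a hypothetical helper implementing the rule displayed above.
def mat_mul(m, n):
    (a, b), (c, d) = m
    (w, x), (y, z) = n
    return ((a * w + b * y, a * x + b * z),
            (c * w + d * y, c * x + d * z))

A = ((0, 1), (0, 0))
B = ((0, 0), (1, 0))

AB = mat_mul(A, B)
BA = mat_mul(B, A)

print(AB)        # ((1, 0), (0, 0))
print(BA)        # ((0, 0), (0, 1))
print(AB == BA)  # False: the two products differ
```

A single pair of matrices with different products in the two orders is enough to show that [M1] fails for this ring.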
1.1.1. Example. Another example that will play an important
role in this book is the set of integers modulo n. For now, consider the ring
$\mathbb{Z}_2$ consisting of two elements {0, 1} with operations given by the tables:


+ | 0  1          · | 0  1
--+------         --+------
0 | 0  1          0 | 0  0
1 | 1  0          1 | 0  1


Notice that in this example, unlike the others, 1+ 1 = 0. This may seem 
rather strange, but it gives us a clue about how to add further properties to 
the above list to ensure that the integers are the only example. 
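
The two tables can be generated mechanically. The sketch below assumes that the operations of this example agree with ordinary addition and multiplication followed by taking the remainder on division by 2; that description anticipates the later treatment of modular arithmetic, and the variable names are our own.

```python
n = 2  # the modulus; changing it gives the tables for the integers modulo n
elements = range(n)

# Row a, column b holds the result of combining a and b.
add_table = [[(a + b) % n for b in elements] for a in elements]
mul_table = [[(a * b) % n for b in elements] for a in elements]

print(add_table)  # [[0, 1], [1, 0]]
print(mul_table)  # [[0, 0], [0, 1]]
```

The entry `add_table[1][1]` is 0, which is the relation 1 + 1 = 0 noted above.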


One property that will ensure we do not get too big a set is a stipulation 
that 


[G1] Z is generated by {0,1} as a ring. 


This means that we start with 0 and 1, and form all the elements needed 
to provide the minimal collection satisfying all our properties. Since a ring 
is closed under the operations of addition and multiplication, we need all 
the numbers of the form 1, 1+1, 1+1+1, 1+1+1+1, .... You should
convince yourself that the distributive law ensures that this set is closed
under multiplication, as well as addition. In order to satisfy [A4], additive
inverses, we may have to add in $-1$, $-(1+1)$, .... You should now convince
yourself that this collection is rich enough to satisfy all the properties. This
includes checking that $(-1) + (-1) = -(1+1)$, etc. None of the necessary
steps are hard, but it is very time-consuming to write them all out. 

Unfortunately, this still does not ensure that we have the integers. The 
example $\mathbb{Z}_2$ above is also generated by its 0 and 1. We can eliminate this
example by decreeing that 1, 1+1, 1+1+1, 1+1+1+1, ... are all different.
If this holds in any ring, the collection $S = \{1, 1+1, 1+1+1, 1+1+1+1, \ldots\}$
will be indistinguishable from the natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$ by
any mathematical property. In fact, to ensure that they are all different, it
is enough that none are 0. Why? Then it follows that $-S$ does not intersect
S (why?), and that $R = S \cup \{0\} \cup -S$ is a ring which has all the same
properties as Z. We will name this last property [F1] for free:


[F1] No nontrivial sum of 1’s is equal to 0. 


We have not written down all the important properties of the integers. 
But at least, we have come up with a list of properties that distinguishes 
the integers from other similar objects. Before leaving this point, we will 
show how we can define another very useful property, order, using what
we already have. Define order as follows:


$a < b$ if $b - a \in \mathbb{N}$
$a = b$ if $b - a = 0$
$a > b$ if $b - a \in -\mathbb{N}$.


This order satisfies some simple properties: 
[O1] For all a, b, and c in Z with a < b, $a + c < b + c$.
[O2] For all a, b, and c in Z with a < b and c > 0, $ac < bc$.


We say that an integer n is positive if n > 0. Notice that by definition 
of the ordering on Z, an integer n is positive if and only if $n \in \mathbb{N}$.

What do we mean when we say that two mathematical objects are the
same, or at least have exactly the same properties? Throughout mathematics,
one is concerned about this issue. It is generally dealt with by considering
maps between sets that preserve the structure that one is studying.
The following definition captures part of this for rings. 


1.1.2. Definition. If R and S are rings, a function $\varphi : R \to S$ is called a
ring isomorphism if


$\varphi$ is a bijection (i.e., one-to-one and onto) such that
$\varphi(0) = 0$ and $\varphi(1) = 1$,

$\varphi(r_1 + r_2) = \varphi(r_1) + \varphi(r_2)$ for all $r_1, r_2 \in R$,
$\varphi(r_1 r_2) = \varphi(r_1)\varphi(r_2)$ for all $r_1, r_2 \in R$.


Say R and S are isomorphic if there exists a ring isomorphism $\varphi : R \to S$.


If R and S are isomorphic rings, then they are indistinguishable on the 
basis of their properties as rings. We consider them to be equivalent objects. 
See Exercise 7. The new ring there is just the integers ‘in disguise’.


Exercises 


1. Show that $\mathbb{Z}[\sqrt{3}]$ is a commutative ring.


2. Show that if [F1] holds, then sums of different numbers of 1’s are all
distinct, and their additive inverses are all distinct from sums of 1’s.


3. Verify the properties of a commutative ring for $\mathbb{Z}_2$.


4. Can an operation < be put on $\mathbb{Z}_2$ satisfying [O1]?


5. Describe explicitly the ring $\mathbb{Z}[\sqrt{5}]$ generated by 1 and $\sqrt{5}$.


6. (a) What are the additive and multiplicative identities for $\mathbb{Z} \oplus \mathbb{Z}$?
(b) Show that there are non-zero elements in $\mathbb{Z} \oplus \mathbb{Z}$ which multiply to 0.


7. Consider the ring $R = \{2^n : n \in \mathbb{Z}\}$ with addition $\oplus$ given by $2^m \oplus 2^n = 2^{m+n}$
and multiplication $\odot$ given by $2^m \odot 2^n = 2^{mn}$. Show that this is a
ring. Then show that the map taking $2^n$ to its logarithm base 2 (namely
n) is a ring isomorphism from R to Z.


8. What other properties of the integers can you think of? Can these 
properties be deduced from [S1], [A1]—[A4], [M1]-[M3], and [D1]? 




1.2. Well Ordering Principle 


In this section, we will look at a ‘self evident’ principle. We shall see that it 
leads us to the principle of induction, a basic proof technique which we will 
introduce here. In mathematics, one does have to be careful about what we 
think is self-evident, as this is not as clear as the reader might think. This 
principle can be justified. 


1.2.1 Well Ordering Principle. Every non-empty subset of N has a 
least element. 


This is true for the following reason. If S is a non-empty subset of N,
then it contains an element s. Consider the finite list of integers 1, 2,3,...,s. 
The first integer in this list which belongs to S is the desired least element. 

We will use this principle to formalize certain arguments. First, let 
us consider induction. Induction is a method used to verify a long (often 
infinite) list of propositions. Call the propositions P(n) for n € N. That is, 
each P(n) is a mathematical statement which might be true or false. 


1.2.2 Principle of Induction. Suppose that proposition P(1) is true.
Furthermore, suppose that if P(k) is true for $1 \le k < n$, then P(n) is true.
Then P(n) is true for all $n \ge 1$.


Proof. Let S be the set of all n such that P(n) is false. If S is empty, we
have the desired conclusion. Otherwise, S is non-empty. In this case, the
Well Ordering Principle tells us that S has a least element, say n. By the
hypotheses, $n \ne 1$. Since n is the smallest integer in S, we see that P(k)
is true for all $1 \le k < n$. By the induction hypothesis, P(n) is true. This
contradicts the fact that P(n) is false. The contradiction must be due to
a false supposition; in this case, that must be the supposition that S is
non-empty. So S is empty, and P(n) is true for all $n \ge 1$. ∎


Sometimes this is called the generalized principle of induction because
it assumes that all of the statements P(k) for $1 \le k < n$ must be known to
be true in order to deduce P(n), not just $P(n-1)$. This is sometimes an
important improvement. See the Second Proof in the next section.


1.2.3. Example. We look for a formula for the sum of the first n squares:
$s_n = \sum_{i=1}^{n} i^2$. The first few terms are 1, 5, 14, 30, 55, 91. While no obvious
formula is apparent, the reader can check that

$$P(n) : \quad s_n = \frac{n(n+1)(2n+1)}{6}$$

is valid for n = 1, 2, 3, 4, 5, 6. We will use induction to verify this for all $n \ge 1$.
First

$$\frac{1(1+1)(2 \cdot 1 + 1)}{6} = \frac{6}{6} = 1 = s_1.$$

Thus P(1) is true. In this example, to check P(n), it is enough to use the
fact that $P(n-1)$ is true. Then

$$s_n = s_{n-1} + n^2 = \frac{(n-1)(n)(2n-1)}{6} + n^2 = \frac{2n^3 - 3n^2 + n + 6n^2}{6} = \frac{2n^3 + 3n^2 + n}{6} = \frac{n(n+1)(2n+1)}{6}.$$

Thus if $P(n-1)$ is true, so is P(n). By induction, this formula holds for all
$n \ge 1$.
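
Before attempting an induction proof, it is often worth testing a conjectured formula by machine. The following sketch (the function names are our own) compares the sum of the first n squares with the closed form for many values of n. Such a check is only evidence; it is the induction argument that establishes the formula for every n.

```python
def sum_of_squares(n):
    """Add up 1^2 + 2^2 + ... + n^2 directly."""
    return sum(i * i for i in range(1, n + 1))

def closed_form(n):
    """The conjectured formula n(n+1)(2n+1)/6; the division is always exact."""
    return n * (n + 1) * (2 * n + 1) // 6

first_terms = [sum_of_squares(n) for n in range(1, 7)]
print(first_terms)  # [1, 5, 14, 30, 55, 91]

all_match = all(sum_of_squares(n) == closed_form(n) for n in range(1, 1001))
print(all_match)    # True
```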


1.2.4. Example. Consider the following ‘proof’ by induction. We will 
show that all people have the same colour hair. Let P(n) be the statement 
that every set of n people all have the same hair colour. This is evident for 
n = 1. Now look at larger n. Suppose that P(n — 1) is true. Given a group 
of n people, apply the induction hypothesis to all but the last person in the 
group. This group all have the same hair colour. Now repeat this argument
with all but the first person. We find that all the people have the same hair 
colour by combining these two facts. By induction, all people have the same 
hair colour. 

This is patently absurd, and you are undoubtedly ready to refute this 
by saying that Eric has different hair colour from Alana. But we want you 
to find the mistake in the induction argument. 


Exercises 


1. Prove by induction that

$$\sum_{i=1}^{n} i^3 = \Bigl(\sum_{i=1}^{n} i\Bigr)^2.$$

2. Prove by induction that 
=, ——— 
2 


i=1 
3. Find the error in the induction argument in Example 1.2.4.
Hint: P(1) is true, and P(73) implies P(74).

4. Prove that $n! > 2^n > n^2$ for $n \ge 5$.


5. Prove that if $x > -1$ is a real number and $n \ge 1$, then $(1 + x)^n \ge 1 + nx$.


6. Let $x > 1$ be a real number such that $x + x^{-1}$ is an integer. Prove that
$x^n + x^{-n}$ is an integer for all $n \ge 1$.
Hint: evaluate $(x + x^{-1})(x^n + x^{-n})$.


7. Consider the Fibonacci sequence, given by F(0) = F(1) = 1, and for
$n \ge 0$, $F(n+2) = F(n) + F(n+1)$. Let $\tau = (\sqrt{5} + 1)/2$. Prove by
induction that

$$F(n) = \bigl(\tau^{n+1} - (-1/\tau)^{n+1}\bigr)/\sqrt{5}.$$


8. Define a sequence of real numbers by the rules

$$s_0 = 0 \quad\text{and}\quad s_{n+1} = \sqrt{3 + s_n} \quad\text{for } n \ge 0.$$

(a) Show by induction that $s_n < s_{n+1} < 3$ for all $n \ge 0$.

(b) The least upper bound principle (see Chapter 5) shows that the sequence
has a limit. Show that the limit should be $\sigma = \frac{1 + \sqrt{13}}{2}$.

(c) Obtain a formula for $\sigma - s_{n+1}$ in terms of $\sigma - s_n$. Hence prove by
induction that $0 < \sigma - s_n < 3/4^n$ for all $n \ge 0$.


9. A real number x has a decimal expansion $x = x_0.x_1x_2x_3\ldots$ where $x_0 \in \mathbb{Z}$
and $x_i \in \{0, 1, 2, \ldots, 9\}$ for $i \ge 1$. Say that this expansion is eventually
periodic if there are positive integers d and N so that $x_{n+d} = x_n$ for all
$n \ge N$. Prove that a real number with eventually periodic decimal
expansion is rational.

Hint: consider $10^{N+d}x - 10^{N}x$.


10. Let $x = \frac{p}{q}$ be a rational number, with $q \ge 1$.

(a) Find $r_k \in \{0, 1, \ldots, q-1\}$ so that $10^k = a_k q + r_k$ for $k \ge 0$. Show
that there are two integers $0 \le k < l \le q$ such that $r_k = r_l$; so
$q \mid (10^l - 10^k)$. Hint: the pigeonhole principle states that if q + 1
objects are placed in q boxes, at least one box has two or more
objects in it.

(b) Show that if $0 < a < 10^d - 1$, then $\frac{a}{10^d - 1}$ has a periodic decimal
expansion.

(c) Prove that x has an eventually periodic decimal expansion.
Hint: consider $(10^l - 10^k)x$.


1.3. Primes 


We have noted that division is not a part of the axioms for the integers. 
There are two good reasons for this. The first is that a/b is not defined as 
an integer for all pairs of integers a and b with $b \ne 0$. Secondly, division
is the inverse relation to multiplication in the same way that subtraction 
is the inverse of addition. Subtraction does not occur in the axioms either; 
but is shorthand for combining addition with the additive inverse. For these 
reasons, we define divisibility in terms of multiplication. 
1.3.1. Definition. Say that an integer a divides an integer b if there is
an integer c such that b = ac. The notation for this is a|b.

An integer p is prime if $p \ne \pm 1$ and the only integers which divide p
are $\pm 1$ and $\pm p$.


We don’t consider $\pm 1$ to be prime because they are invertible in Z;
such elements are called units of Z. They divide every number, and this does not
substantially change the factorization. Note that $\pm 2$ and $\pm 3$ are primes. So


$$6 = 2 \cdot 3 = (-2)(-3) = 3 \cdot 2 = (-3)(-2).$$


These are considered to be trivial differences because permutation of factors 
is irrelevant since multiplication is commutative, and —1 is a unit, so that 
we can put this as a factor into any term. Common practice is to factor 
positive integers into a product of positive primes in increasing order. 

The most important fact about factoring integers is that each integer 
can be written as a product of primes in exactly one way (up to signs and 
permutation of the factors). This is known as the Fundamental Theorem of 
Arithmetic. It is not particularly easy to prove. Indeed, it would be quite 
an accomplishment to do this properly without having seen a proof yourself. 
We will prove this theorem in this book, but it will take some preparation. 

In this section, we content ourselves with something easier—existence of 
a factorization into primes. 


1.3.2. Lemma. If n = ab with $a, b \in \mathbb{N}$, then $a \le n$. In particular, if
$b \ne 1$, then a < n.


Proof. Since $b \in \mathbb{N}$, we have $1 \le b$. Thus, $a \le ab = n$. Furthermore, if
$b \ne 1$, then 1 < b and so a < ab = n. ∎


1.3.3. Theorem. Every integer n > 1 is the product of a finite set of 
primes. 


Proof. Let S be the set of all integers n > 1 which are not the product of 
finitely many primes. We want to prove that S is empty. If it is not empty, 
then by applying the Well Ordering Principle, we obtain a least integer n 
which cannot be factored into primes. If n were prime, it would be the 
product of one prime, namely itself. So, n cannot be prime and we may
write n = ab where a and b are positive integers, neither of which is 1. By
Lemma 1.3.2, we see a and b are less than n, so they cannot belong to S.
Therefore both can be factored into primes, say

$$a = p_1 p_2 \cdots p_k \quad\text{and}\quad b = q_1 q_2 \cdots q_l.$$

Then n can be factored as $n = p_1 p_2 \cdots p_k q_1 q_2 \cdots q_l$. This contradicts the
fact that n is in S, and so S must be empty. ∎
Second Proof. This is the same proof, but using induction rather
than the Well Ordering Principle. Notice that we need the full generality of
the Principle of Induction.

Let P(n) for $n \ge 2$ be the statement that n factors as the product of
primes. Check by hand that 2 is prime, and so P(2) holds. See Exercise 5.
(This is our starting point, since there is no statement P(1).) Now suppose
that P(k) holds for all k < n. If n is prime, then it is the product of one
prime, so P(n) holds. On the other hand, if n is not prime, factor n = ab
for a, b > 1. As above, 1 < a, b < n. By the induction hypothesis, P(a) and
P(b) are true. (Here is where the full strength of the induction hypothesis
is required.) Therefore, we can factor a and b into products of primes. As
above, we can multiply them together to obtain a factorization of n into a
product of primes. By induction, the theorem is true for all $n \ge 2$. ∎
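
The second proof is really a recursive recipe: either n is prime, or n = ab with both factors smaller than n, and we factor a and b in turn. Here is a minimal sketch of that recipe in Python. The function name `factor` and the search for a divisor by trial division up to the square root of n (compare Exercise 4 below) are our own choices for illustration; the proof itself does not say how to find a and b, and this is not an efficient method for large n.

```python
def factor(n):
    """Return a list of primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Look for a splitting n = a * b with 1 < a <= b < n.
    a = 2
    while a * a <= n:
        if n % a == 0:
            b = n // a
            # Both a and b are smaller than n, so the recursion terminates,
            # just as the induction hypothesis applies to P(a) and P(b).
            return factor(a) + factor(b)
        a += 1
    # No divisor a with 1 < a and a*a <= n exists, so n is prime.
    return [n]

print(factor(360))  # [2, 2, 2, 3, 3, 5]
print(factor(97))   # [97]
```

Note that this only produces one factorization; nothing in the code (or in the proof) shows that a different sequence of splittings could not lead to a different list of primes.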


Why is this not enough for the Fundamental Theorem mentioned above? 
Because we do not know if the product of two different sets of primes can be 
the same! For small numbers, you know from experience that there is only 
one way to factor them. This property is known as Unique Factorization. 
But how many 1000 digit numbers have you tested? If the answer is more 
than one, what about numbers with $10^{10^{10}}$ digits? We need an argument
that goes beyond this common experience. The tool we need is the Euclidean 
algorithm, which we develop in a later section. 


Exercises 


1. Let a, b, c, r, s be integers. Show that if a|b and a|c, then a|(br + cs).


2. Show that if a|b and b|c, then a|c.


3. Show that if c and d are integers such that c|d and d|c, then $d = \pm c$.


4. Show that if $1 < a \in \mathbb{N}$ has no divisor p with $1 < p \le \sqrt{a}$, then a is
prime.


5. Use the lemma in this section to show that 2 is prime.


6. Show that if a product of integers $a = a_1 a_2 \cdots a_k$ is even, then at least
one of the factors $a_i$ must be even.


7. Show that if a product of integers $a = a_1 a_2 \cdots a_k$ is a multiple of 3, then
at least one of the factors $a_i$ is a multiple of 3.


8. Does your method of proof in the previous question give any insight into 
what happens when we replace 3 by 1049? 


9. (Sieve of Eratosthenes) Imagine that you have listed all of the inte- 
gers from 1 to 10000. Cross out 1. Now 2 is the first remaining number. 
Cross out every second number following 2, i.e., 4,6,8,.... Now 3 is 
the next remaining number. Cross out every third number following 3,
i.e., 6,9,12,.... Some numbers like 6 and 12 are crossed out more than 
once. Show that after you have crossed out all multiples of 97, what 
remains is a list of all primes less than 10000. 
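
The crossing-out procedure of Exercise 9 is easy to carry out by machine. The sketch below is one possible rendering in Python; the function name `sieve` and the list of flags are our own. It stops once the square of the current number exceeds the limit, which for 10000 means that the last prime whose multiples are crossed out is 97. Running it does not replace the proof asked for in the exercise.

```python
def sieve(limit):
    """Return all primes p <= limit by crossing out multiples, as in Exercise 9."""
    crossed_out = [False] * (limit + 1)
    crossed_out[0] = True          # 0 is not on the list; the index is kept for convenience
    if limit >= 1:
        crossed_out[1] = True      # "Cross out 1."
    p = 2
    while p * p <= limit:
        if not crossed_out[p]:     # p is the next remaining number
            for multiple in range(2 * p, limit + 1, p):
                crossed_out[multiple] = True
        p += 1
    return [k for k in range(limit + 1) if not crossed_out[k]]

primes = sieve(10000)
print(primes[:10])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(len(primes))  # 1229
```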


1.4. Many Primes 


In this section, we will give two proofs that there are infinitely many primes. 
The first proof of this fact is credited to Euclid, and dates from about 300
BCE (see Exercise 2). Our first proof is slightly easier. Our second proof is
much harder, and you may skip it without loss. But it gives some indication 
that primes are quite plentiful, whereas with the first proof, primes could 
still be very rare. 

In fact, the famous Prime Number Theorem shows that the number $\pi(n)$
of primes less than or equal to n is approximately $n/\log(n)$ in the sense that

$$\lim_{n \to \infty} \frac{\pi(n)}{n/\log n} = 1.$$

Because the log function grows very slowly, this means that primes are 
quite common. The prime number theorem was conjectured by Legendre 
and Gauss about 200 years ago. They used extensive tables of primes to 
test the conjecture, but were not able to prove it. Riemann introduced the 
famous Riemann zeta function, and established important relationships be- 
tween the properties of this function and the distribution of the primes. One 
of the most important outstanding mathematical problems, known as the 
“Riemann hypothesis”, asks about the location of the zeros of this function. 
In 1896, Hadamard and de la Vallée Poussin finally proved the prime num-
ber theorem, independently of each other, by obtaining partial information 
about the zeros of the zeta function. 
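
In the spirit of the tables used by Legendre and Gauss, the reader can tabulate the ratio of $\pi(n)$ to $n/\log n$ directly. The sketch below counts primes with a simple sieve (the function name `prime_count` is our own) and uses the natural logarithm. For the values of n shown, the ratio stays a little above 1 and decreases slowly, which is consistent with the theorem but of course proves nothing.

```python
import math

def prime_count(n):
    """pi(n): the number of primes p <= n, found with a simple sieve."""
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return sum(is_prime)

for n in (10**3, 10**4, 10**5, 10**6):
    ratio = prime_count(n) / (n / math.log(n))
    print(n, prime_count(n), round(ratio, 4))
```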


1.4.1. Theorem. There are infinitely many primes. 


Proof. Let n > 1. By Theorem 1.3.3, there is a prime, say $p_n$, which
divides $n! + 1$. If $p_n \le n$, then $p_n \mid n!$ as well, and hence it would divide
$(n! + 1) - n! = 1$, which is absurd. Thus $p_n > n$. Therefore the set of prime
numbers is unbounded, and thus is infinite. ∎
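
The proof can be watched in action for small n. The following sketch finds the smallest prime factor of n! + 1 by trial division (the helper name `smallest_prime_factor` is our own) and confirms that it exceeds n in each case tried. Trial division is only practical here because n is small; n! + 1 grows very quickly.

```python
import math

def smallest_prime_factor(m):
    """Return the least divisor d > 1 of m (which is necessarily prime), for m >= 2."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # no divisor up to sqrt(m), so m itself is prime

for n in range(1, 11):
    m = math.factorial(n) + 1
    p = smallest_prime_factor(m)
    print(n, m, p, p > n)
```

For instance 4! + 1 = 25 has smallest prime factor 5, and 5! + 1 = 121 has smallest prime factor 11, so n! + 1 need not itself be prime.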


One way to gauge the density of the primes is the following result which 
says that the sum of the reciprocals of the primes diverges. For a quickly 
growing series like the powers of 2, the sum of the reciprocals converges 
quickly. For the set of perfect squares, one verifies that the sum of the 
reciprocals converges by the integral test from calculus. In fact, even for 
a series like $n \log n (\log\log n)^2$, the sum of the reciprocals converges. So
prime numbers occur more frequently in some sense. Indeed, this gives 
some credence to the prime number theorem. 
1.4.2. Theorem.

$$\sum_{p \text{ prime}} \frac{1}{p} \quad \text{diverges.}$$


Proof. Let us number the primes in increasing order as $p_1 < p_2 < \cdots$.
Again the proof proceeds by obtaining a contradiction if $\sum_{i \ge 1} \frac{1}{p_i} < \infty$. In this
case, the ‘tail’ of the series is small. So we can choose an integer k so that

$$\sum_{i > k} \frac{1}{p_i} < \frac{1}{2}.$$

Fix the large integer $N = 4^{k+1}$. We will count the set $\{1, 2, \ldots, N\}$ in
a different way. The first step is to count the numbers from 1 to N which
have a big prime factor $p_i$ for $i > k$. There are at most $N/p_i$ numbers in
this range which are multiples of $p_i$. Adding this up over all $i > k$, we find
that there are at most

$$n = \sum_{i > k} \frac{N}{p_i} < \frac{N}{2}$$

numbers in $\{1, \ldots, N\}$ which have any of these primes as a factor. (This is
a rather crude estimate because any multiple of more than one large prime
is counted more than once; and if $p_i > N$, there are no multiples at all.)
Now the remaining numbers all have the form

$$a = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}.$$

To count these, we factor out the biggest square possible. That is, we write
$a = b^2 c$ where

$$b = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k} \quad\text{and}\quad c = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k},$$

where if $n_i = 2m_i$ is even, then $e_i = 0$ and if $n_i = 2m_i + 1$ is odd, then
$e_i = 1$. There are at most $\sqrt{N}$ ways of choosing b since $1 \le b \le \sqrt{a} \le \sqrt{N}$.
Since there are only two choices for each $e_i$, there are at most $2^k$ ways of
choosing c. So altogether, there are at most $m = 2^k \sqrt{N}$ ways of obtaining
numbers of this form in $\{1, \ldots, N\}$. (This estimate is crude too, but uses a
trick that makes it pretty good.)

Combining these two estimates, we have counted all numbers from 1 to
N at least once. So

$$4^{k+1} = N \le n + m < \frac{N}{2} + 2^k \sqrt{N} = 2^{2k+1} + 2^k 2^{k+1} = 4^{k+1}.$$

This is an absurd statement, contradicting our hypothesis that the reciprocals
of the primes converged. So the series must diverge. ∎
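
The key trick in the proof is to write a number as a square times a product of distinct primes. The sketch below carries out this decomposition for a given positive integer, using trial division; the function name `square_and_squarefree` is our own. It illustrates only the step $a = b^2 c$, not the counting argument as a whole, which rests on a hypothesis that turns out to be false.

```python
def square_and_squarefree(a):
    """Write a = b*b*c with c a product of distinct primes (or 1); return (b, c)."""
    b, c = 1, 1
    p = 2
    while p * p <= a:
        exponent = 0
        while a % p == 0:      # strip out every factor of p
            a //= p
            exponent += 1
        b *= p ** (exponent // 2)   # m_i: half the exponent, rounded down
        c *= p ** (exponent % 2)    # e_i: 0 if the exponent is even, 1 if odd
        p += 1
    c *= a  # what is left is 1 or a single prime with exponent 1
    return b, c

print(square_and_squarefree(360))  # (6, 10), since 360 = 6^2 * 10
print(square_and_squarefree(49))   # (7, 1)
```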


Exercises 


1. Show that there are arbitrarily long strings of consecutive composite 
numbers. 
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2. (Euclid’s proof) Suppose that the list of primes is finite: p),p2,...,Dn- 
Consider a prime factor of pjpo---pn +1. Conclude that there are 
infinitely many primes. 


3. The Fermat numbers are the integers F,, = 2?" + 1. 
(a) Show that 2 +1 divides x?° — 1 for any positive integer s. Hence 
show that F;, divides F,,, — 2 for allm>n. 
(b) Show that the F;, have no common prime factors. Hence give another 
proof that there are infinitely many primes. 


4. Suppose that $p_1, \dots, p_r$ is a list of distinct primes. Let $N = p_1 p_2 \cdots p_r$
and $q_i = N/p_i$. Define $M = \sum_{i=1}^{r} q_i$. Show that no $p_i$ can divide $M$.
Conclude that there are infinitely many primes.


5. Show that there are infinitely many primes of the form $4n + 3$.
HINT: for $n > 4$, show that $n! - 1$ has a prime factor $p_n$ of this form.


6. Show that if $n > 1$ and $a^n - 1$ is prime, then $a = 2$ and $n$ is prime.
HINT: factor the polynomial $x^n - 1$.


7* This is an exercise to see that the prime number theorem is plausible.
(a) Use the integral test to show that the following series converge: 


= t saa | 
2 sen? and a, 


(b) Show that a(n) > (en? infinitely often. 
HINT: show that the sum of the reciprocals of the primes between
$2^{k-1}$ and $2^k$ is at most $2^{1-k}\pi(2^k)$. Use this to estimate the sum of
the reciprocals of all primes.


1.5. Euclidean Algorithm 


Long division is an algorithm usually taught in elementary school that allows 
one to divide a (usually smaller) number into another (usually larger) one, 
and obtain an integer quotient and remainder. This is actually quite a strong 
result, as it is the key to establishing the Euclidean algorithm in the next 
section. Yet because it is familiar since childhood, we take it for granted. 
This is formalized as follows: 


1.5.1 Division Algorithm. Suppose $a \in \mathbb{N}$ and $b \in \mathbb{Z}$. Then there are
unique integers $q$ and $r$ such that

$$b = aq + r \quad\text{and}\quad 0 \le r < a.$$
Proof. We will apply the Well Ordering Principle to the set of all non-negative
remainders to obtain the smallest one. Let

$$S = \{ s : s = b - aq \ge 0 \text{ and } q \in \mathbb{Z} \}.$$

First note that $S$ is non-empty. For if $b \ge 0$, take $q = 0$ and obtain $b \in S$.
And if $b < 0$, take $q = b$ to obtain

$$s = b - ab = b(1 - a) \ge 0.$$

Let $r$ be the least element of $S$ (whose existence is guaranteed by the Well
Ordering Principle), and let $q$ be the integer so that $r = b - aq$. If $r \ge a$, then

$$s = b - (q+1)a = r - a \ge 0$$

is a smaller element of $S$. Therefore $0 \le r < a$.
It remains to verify uniqueness. Suppose that

$$b = aq_1 + r_1 = aq_2 + r_2 \quad\text{and}\quad 0 \le r_i < a \ \text{ for } i = 1, 2.$$

Subtracting yields $a(q_1 - q_2) = r_2 - r_1$. But $-a < r_2 - r_1 < a$, so the only
multiple of $a$ in this range is 0. Hence $r_2 = r_1$, and thus $q_1 = q_2$. ∎
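
The statement of 1.5.1 is easy to check numerically. The sketch below is in Python, whose built-in divmod returns a non-negative remainder whenever the divisor is positive, so it matches the normalization 0 <= r < a even for negative b. The wrapper name division_algorithm is ours and is not taken from the text.

```python
def division_algorithm(a, b):
    """Return (q, r) with b == a*q + r and 0 <= r < a, for an integer a >= 1."""
    if a < 1:
        raise ValueError("a must be a positive integer")
    # For a positive divisor, floor division gives 0 <= r < a,
    # including the case where b is negative.
    q, r = divmod(b, a)
    return q, r


if __name__ == "__main__":
    for b in (17, -17, 0, 5):
        q, r = division_algorithm(5, b)
        print(f"{b} = 5*({q}) + {r}")
```

Note that many other languages truncate toward zero instead, in which case a negative b would give a negative remainder that has to be adjusted by adding a.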


1.5.2. Definition. The greatest common divisor of a pair of non-zero
integers $a$ and $b$ is the largest number $d$, denoted $\gcd(a,b)$, which divides
both of them. Two integers $a$ and $b$ are called relatively prime if
$\gcd(a, b) = 1$.


The notion of largest common divisor, in terms of the natural order on 
Z, is not directly compatible with divisibility. In other words, small num- 
bers need not divide big ones. So one cannot say, without some additional 
argument, that the largest common divisor of two integers is related to other 
divisors in any multiplicative way. In fact, the reason that all divisors of 
two numbers divide the largest common divisor is the basis for proving that 
factoring numbers into primes is unique. 

The theoretical and computational importance of the greatest common 
divisor lies in the fact that there is a simple algorithm for computing it, 
which, at the same time, reveals some of the deeper structure. This al- 
gorithm is known as the Euclidean algorithm. It is best seen through an 
example. But first, we describe the basic idea. 

Start with two positive integers $a$ and $b$, and say that $a > b$. Divide $b$ into
$a$ to obtain a remainder $r_1$ and quotient $q_1$. From the division algorithm,
we have $0 \le r_1 < b$, and $r_1 = a - q_1 b$. Now divide $r_1$ into $b$ to obtain a
remainder $r_2$. Notice that $r_2$ can be expressed in terms of $b$ and $r_1$, and
hence in terms of $a$ and $b$. Repeat this operation by now dividing $r_2$ into
$r_1$, etc. Eventually, this process ends because the remainders are decreasing
and must eventually reach zero. The last non-zero remainder will be the
$\gcd(a, b)$. As we go along, we keep track of how to express all the remainders
in terms of integer combinations of $a$ and $b$.


1.5.3. Example. Consider the algorithm for gcd(901, 636). Now 636 goes
into 901 $q_1 = 1$ times with remainder $r_1 = 265$. So 265 = 901(1) + 636(-1);
we write $s_1 = 1$ and $t_1 = -1$. Next 265 goes into 636 $q_2 = 2$ times with
remainder $r_2 = 106$. Therefore,
106 = 636 - 265(2)
    = 636(1) - (901(1) + 636(-1))(2)
    = 901(-2) + 636(3).
We write $s_2 = -2$ and $t_2 = 3$. Repeating this procedure, we see that 106
goes into 265 $q_3 = 2$ times with remainder $r_3 = 53$. And
53 = 265 - 106(2)
   = (901(1) + 636(-1)) - (901(-2) + 636(3))(2)
   = 901(5) + 636(-7).


We set $s_3 = 5$ and $t_3 = -7$. Finally, 53 divides into 106 exactly 2 times with
0 remainder. The following chart helps to keep track of this information.


   r  |  q  |   s  |   t
 -----+-----+------+-----
  901 |     |    1 |    0
  636 |     |    0 |    1
  265 |  1  |    1 |   -1
  106 |  2  |   -2 |    3
   53 |  2  |    5 |   -7
    0 |  2  |  -12 |   17


Notice that one obtains the value of s and t in a given row by subtracting 
q times the row above from the row above that. 

Now you should notice that 53 divides 901 and 636. The reason this 
happens is explained recursively. First, the fact that the next remainder is 
0 means that 53 exactly divides 106. The equation 265 = 106(2) +53 shows 
that 53 divides 265. Next, one has 636 = (2)265 + 106, so that 53 divides 
636. Finally, since 901 = 636 + 265, it is also a multiple of 53. Thus 53 is a 
common divisor of 636 and 901. 

Next, suppose that d divides both 636 and 901. Then the equation 
53 = 901(5) — 636(7) implies that d divides 53. In particular, 53 must be 
the biggest divisor because all common divisors of 636 and 901 divide it. 


It seems worthwhile to try to set down the main ideas of the proof here 
in general. However, if you do not think that you already have the basic idea 
of how it goes, stop now and work out a couple of examples on your own. 
Then look over the example above again to see if the arguments make more 
sense. Experience shows that trying to understand the general argument 
before understanding the concrete example is often futile. 


1.5.4 Euclidean Algorithm. Given two positive integers $a > b$, use
the division algorithm repeatedly to obtain a sequence of remainders $r_i$ for
$1 \le i \le k+1$ until the last remainder $r_{k+1} = 0$. Then $\gcd(a,b) = r_k$, and
there are integers $s$ and $t$ so that $\gcd(a,b) = as + bt$. Moreover, every divisor
of both $a$ and $b$ divides $\gcd(a, b)$.


Proof. For convenience of notation, we will write $r_{-1} = a$ and $r_0 = b$.
Notice that $a$ and $b$ are combinations of themselves. That is, $a = a(1) + b(0)$
and $b = a(0) + b(1)$. So we define $s_{-1} = 1$, $t_{-1} = 0$, $s_0 = 0$, and $t_0 = 1$.
Now we proceed with our algorithm by induction. At each stage, we have
each remainder $r_i = as_i + bt_i$. If $r_i \ne 0$, divide it into $r_{i-1}$ to obtain
$r_{i-1} = r_i q_{i+1} + r_{i+1}$ with remainder $0 \le r_{i+1} < r_i$. We have the equation

$$\begin{aligned}
r_{i+1} &= r_{i-1} - r_i q_{i+1} \\
        &= (as_{i-1} + bt_{i-1}) - (as_i + bt_i)\, q_{i+1} \\
        &= a(s_{i-1} - s_i q_{i+1}) + b(t_{i-1} - t_i q_{i+1}).
\end{aligned}$$

This writes $r_{i+1}$ in the form $as_{i+1} + bt_{i+1}$, and in fact yields the explicit
expressions $s_{i+1} = s_{i-1} - s_i q_{i+1}$ and $t_{i+1} = t_{i-1} - t_i q_{i+1}$. Since $r_i$ is a
strictly decreasing sequence of non-negative integers, this process eventually
stops with a zero remainder $r_{k+1}$.

Now we work our way back up the list, proving that $r_k$ divides all of the
$r_i$. To begin, $r_k$ divides itself; and the identity $r_{k-1} = r_k q_{k+1}$ shows that
$r_k$ divides $r_{k-1}$. Suppose that we have shown that $r_k$ divides $r_{i+1}$ and $r_i$.
The identity $r_{i-1} = r_i q_{i+1} + r_{i+1}$ holds. Since $r_k$ divides the right hand side
of the equation, it must also divide $r_{i-1}$. Continue this process until it is
shown that $r_k$ divides both $r_0 = b$ and $r_{-1} = a$.

Lastly, it must be shown that every divisor $d$ of both $a$ and $b$ divides $r_k$.
Now $r_k = as_k + bt_k$. It is clear that $d$ divides the right-hand side, hence $d$
divides $r_k$. Thus $r_k = \gcd(a, b)$. ∎
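
The bookkeeping in this proof translates directly into a short program. The following Python sketch is one possible rendering, not a definitive implementation: it keeps the rows (r, q, s, t) of the chart, starting from the rows for a and b, and forms each new row by subtracting q times the row above from the row above that. The function name extended_euclid and the tuple layout are our own choices.

```python
def extended_euclid(a, b):
    """Return (g, s, t, rows) with g == gcd(a, b) == a*s + b*t.

    Assumes a and b are positive integers. Each entry of rows is a tuple
    (r, q, s, t) satisfying r == a*s + b*t; q is None for the two
    starting rows, which have no quotient.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    prev = (a, None, 1, 0)  # r_{-1} = a = a(1) + b(0)
    curr = (b, None, 0, 1)  # r_0   = b = a(0) + b(1)
    rows = [prev, curr]
    while curr[0] != 0:
        q, r = divmod(prev[0], curr[0])
        # New row = (row two above) - q * (row above).
        nxt = (r, q, prev[2] - q * curr[2], prev[3] - q * curr[3])
        rows.append(nxt)
        prev, curr = curr, nxt
    g, _, s, t = prev  # the last non-zero remainder and its coefficients
    return g, s, t, rows


if __name__ == "__main__":
    g, s, t, rows = extended_euclid(901, 636)
    for row in rows:
        print(row)
    print("gcd =", g, "=", f"901*({s}) + 636*({t})")
```

Running it on 901 and 636 reproduces the chart of Example 1.5.3 row by row.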


Extending 1.5.4 slightly further, we can give an alternative characterization
of the gcd: while the gcd is defined to be the greatest common divisor, it
turns out that it is also the least positive solution to a certain Diophantine
equation. A Diophantine equation is an equation with integer coefficients
for which we seek only integer solutions. This will be explored in greater
depth in Chapter 3.


1.5.5. Corollary. Let $a$ and $b$ be positive integers. Then $\gcd(a,b)$ is the
least positive integer $d$ for which there exist $x, y \in \mathbb{Z}$ with

$$ax + by = d.$$


Proof. Let $d' = \gcd(a,b)$ and let $d$ be the least positive integer for which
the equation $ax + by = d$ has integer solutions. The Euclidean Algorithm
1.5.4 shows that there exist $x_0, y_0 \in \mathbb{Z}$ such that $ax_0 + by_0 = d'$. Therefore,
$d \le d'$. On the other hand, $d' \mid a$ and $d' \mid b$, so we see $d'$ divides $ax + by = d$,
and hence $d' \le d$. Therefore, $d' = d$. ∎
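
For small numbers, the corollary can be confirmed by brute force. The Python sketch below searches all combinations ax + by with |x| and |y| up to a bound and returns the least positive value. The function name and the default bound max(a, b) are our assumptions; that bound is enough because coefficients with |x| <= b and |y| <= a can always be found.

```python
from math import gcd


def least_positive_combination(a, b, bound=None):
    """Least positive value of a*x + b*y over |x|, |y| <= bound (a, b >= 1)."""
    if a < 1 or b < 1:
        raise ValueError("a and b must be positive integers")
    if bound is None:
        bound = max(a, b)  # assumed search range; large enough for a, b >= 1
    span = range(-bound, bound + 1)
    return min(a * x + b * y for x in span for y in span if a * x + b * y > 0)


if __name__ == "__main__":
    for a, b in [(12, 18), (35, 64), (21, 6)]:
        print(a, b, least_positive_combination(a, b), gcd(a, b))
```

This is only a sanity check; the search is quadratic in the bound, whereas the Euclidean algorithm finds the same number in a handful of divisions.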
Exercises


1. Prove that the remainder on division by 9 is obtained by the “casting 
out nines” algorithm. The method is to add the decimal digits of the 
given number. If the total is more than 9, repeat the procedure until 
the sum is a single digit. Replace 9 by 0. This result is the remainder 
after dividing by 9. Explain. 


2. Find the gcd of each of the following pairs of numbers, and express it as 
an integer combination of these numbers. 
(a) 31463 and 9782. 
(b) 65778 and 52507. 
(c) 5564737 and 5574221. 
(d) 2452548 and 2943234. 


3. Define $\mathrm{lcm}(a, b) = ab/\gcd(a, b)$.
(a) Show that $\mathrm{lcm}(a, b)$ is a multiple of $a$ and a multiple of $b$.
(b) Show that if $a \mid n$ and $b \mid n$, then $\mathrm{lcm}(a, b) \mid n$.


4. Prove the following formulae for integers $a$, $b$, $d$ and $k$.
(a) $\gcd(a, b + ka) = \gcd(a, b)$.
(b) $\gcd(ka, kb) = |k| \gcd(a, b)$.
(c) $\gcd(\frac{a}{d}, \frac{b}{d}) = 1$ when $d = \gcd(a, b)$.


5. Write a computer program to implement the Euclidean algorithm. The
input is $a, b \in \mathbb{N}$. The output should be $\gcd(a, b)$ together with $s, t \in \mathbb{Z}$
so that $\gcd(a, b) = as + bt$.


6. The last step of the Euclidean algorithm yields $0 = as_{k+1} + bt_{k+1}$. Show
that $s_{k+1} = \pm b/d$ and $t_{k+1} = \mp a/d$, where $d = \gcd(a, b)$.
HINT: use induction to show that $s_i t_{i-1} - s_{i-1} t_i = \pm 1$.


7* Find all strictly increasing functions $f : \mathbb{N} \to \mathbb{N}$ such that $f(2) = 2$, and
whenever $\gcd(m,n) = 1$, then $f(mn) = f(m)f(n)$.


1.6. Factoring Integers 


In this section, we will prove the Fundamental Theorem of Arithmetic. 
This simply states that every number factors into primes in exactly one way. 
This is very important, and without the aid of the Euclidean algorithm, it 
would be very difficult to prove. In fact, we will see in Section 3.3 that there
are number systems which do not have this unique factorization property 
while others much like them do. So unique factorization is a special property 
which relies on important structural properties of the integers which are not 
immediately obvious. 

The key to the proof is the following lemma, which follows quickly from 
the tools we have now. Try to prove it without the Euclidean algorithm. 
1.6.1. Lemma. If $\gcd(a,b) = 1$ and $a \mid bc$, then $a \mid c$.


Proof. From the Euclidean algorithm, we obtain integers $s$ and $t$ such that
$as + bt = 1$. Since $a \mid bc$, there is an integer $d$ so that $ad = bc$. Thus,

$$c = (as + bt)c = a(sc + dt).$$

Therefore $c$ is a multiple of $a$. ∎


1.6.2. Corollary. Suppose that a prime $p$ divides the product $a_1 a_2 \cdots a_k$.
Then there is an index $j$ so that $p \mid a_j$.


Proof. We proceed by induction on $k$. This is evident for $k = 1$. For $k = 2$,
this will follow from the lemma. For $\gcd(p, a_1)$ divides $p$, and thus is 1 or $p$.
In the first case, the lemma yields $p \mid a_2$; while the latter yields $p \mid a_1$.

Now suppose that we have verified the result for $k - 1$. By hypothesis,

$$p \mid (a_1 \cdots a_{k-1})\, a_k.$$

Applying the result for $k = 2$, we obtain $p \mid a_k$ or $p \mid a_1 \cdots a_{k-1}$. If it is this
second case, the induction hypothesis provides the desired conclusion. ∎


Note that this is not the most basic type of induction. As well as needing 
the result for k — 1, we also need the k = 2 result. The following corollary 
is an immediate consequence of the one above, so no proof is needed. Make 
sure that you understand why this is the case. 


1.6.3. Corollary. If a prime $p$ divides $a^k$, then $p \mid a$.


The numbers $\pm 1$ are units of $\mathbb{Z}$, meaning that they are invertible
elements; namely $1 \cdot 1 = 1 = (-1)(-1)$. Any factorization into primes can be
modified by multiplying each prime by a unit, provided that the product of
all of the units used is 1. By convention, we consider a unit to be a product
of no primes.


1.6.4 Fundamental Theorem of Arithmetic. Every non-zero integer
factors uniquely as a product of primes. More precisely, suppose $n \ge 2$
is an integer, and two factorizations into products of positive primes

$$n = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s$$

are given. If the factors are arranged so that $p_1 \le p_2 \le \dots \le p_r$ and
$q_1 \le q_2 \le \dots \le q_s$, then $r = s$ and $p_i = q_i$ for $1 \le i \le r$.


Proof. Let us prove this by induction on $n$. Let $P(n)$ be the statement
that $n$ factors uniquely into a product of positive primes in increasing order.
First suppose $n = 2$. We know that 2 is prime, and thus has a unique
factorization 2 = 2 as a product of primes; hence $P(2)$ holds.

Next suppose that the result holds for all $2 \le m < n$. Furthermore,
there is no harm in assuming that we listed our two factorizations of $n$ so
that $p_1 \le q_1$. Since

$$p_1 \mid n = q_1 \cdots q_s,$$

Corollary 1.6.2 above implies that $p_1$ divides some $q_j$. Since $q_j$ is prime, this
means $p_1 = q_j$. However $p_1 \le q_1 \le q_j = p_1$, so we see that $p_1 = q_1$. Let
$m = n/p_1$. Then

$$p_2 \cdots p_r = m = q_2 \cdots q_s.$$

If $m = 1$, then $p_2 \cdots p_r = 1$ which implies that $p_2 \cdots p_r$ is the empty product,
i.e. $r = 1$; and similarly $s = 1$. Therefore, $p_1 = n = q_1$ and the result is
proven. If $m > 1$, then since $m < n$, by induction, $r - 1 = s - 1$ and $p_i = q_i$
for $2 \le i \le r$. Hence the result is also established for $n$. ∎
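
The theorem guarantees that the list of prime factors in increasing order is well defined, so a program may compute it in any convenient way. A minimal Python sketch using trial division follows; the function name prime_factors is ours, and the method is only practical for fairly small n.

```python
def prime_factors(n):
    """Return the prime factors of an integer n >= 2 in non-decreasing order."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    p = 2
    while p * p <= n:
        # Any p reached here that divides n is prime, because all smaller
        # prime factors have already been divided out.
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # the remaining cofactor is itself prime
    return factors


if __name__ == "__main__":
    for n in (2, 360, 636, 901, 97):
        print(n, prime_factors(n))
```

For the two numbers of Example 1.5.3, the output lists make the common factor 53 visible at a glance.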


Exercises 


1. Factor into primes the number 
n = (5564737) (5541307) = (5574221) (5531879). 
You may assume that n has no factors less than 50. 
2. Find gced(100!, 3!°). Why was this question not in the previous section? 
3. How many terminal zeros are there in the decimal number 250!?


4. (a) Count the number of positive integer divisors of a number n with 
prime factorization n = p?q® where p and q are distinct primes. 
(b) Find a general formula for the number of divisors of p%q’. 


5. Show that $\gcd(a^2, b^2) = \gcd(a, b)^2$.


6. A number is called perfect if it is equal to the sum of all of its proper
positive integer divisors. For example, 6 = 1 + 2 + 3. Show that if $p$ and
$2^p - 1$ are both prime, then $2^{p-1}(2^p - 1)$ is a perfect number.


7. If you have a symbolic manipulation program, factor n given that it is 
the product of 


4609068862978065342371213044512378636389457901495069208081 
and 
4609068862978065342371213053881215673426353463259338798251 
and is also the product of 
4609068862978065342371213050758269994414054942671248930813 
and 
4609068862978065342371213047635324315401756422083159071287. 


HINT: factoring such large numbers is slow, but gcd’s are fast. 
8. Let the set of all primes be listed in order as $p_1, p_2, p_3, \dots$. Suppose that
$n = p_1^{a_1} \cdots p_r^{a_r}$ and $m = p_1^{b_1} \cdots p_r^{b_r}$, where the superscripts $a_i$ and $b_i$ may
be 0. Find the formula for $\gcd(n, m)$.


9. Suppose that p and q are consecutive odd primes. Prove that p+ q has 
at least three prime factors (not necessarily distinct). 


10. Suppose that $a, b, c, d \in \mathbb{N}$ and $\gcd(a,b) = 1$. Show that if $ab = c^d$, then
$a$ and $b$ are both $d$th powers.


11. Suppose that $a, b, c \in \mathbb{N}$ and $\gcd(a, b) = 1$. Show that if $c \mid ab$, then there
is a unique factorization $c = c_1 c_2$ in $\mathbb{N}$ such that $c_1 \mid a$ and $c_2 \mid b$.


1.7. Irrational Numbers 


An irrational number is a real number which cannot be expressed as a 
quotient of two integers. This may seem to be unrelated to the subject just 
covered. But in fact, many of the proofs of irrationality depend on unique 
factorization. 

Let us look at the argument that $\sqrt{3}$ is irrational. It is proved by assuming
that $\sqrt{3} = a/b$, where $a$ and $b$ are integers, and obtaining a contradiction.
We may suppose that $\gcd(a, b) = 1$. Squaring and cross multiplying yields

$$3b^2 = a^2.$$

So 3 divides $a^2$, and hence by Corollary 1.6.3, 3 divides $a$. If we write $a = 3c$
and substitute into our equation, we obtain

$$3b^2 = 9c^2 \quad\text{and hence}\quad b^2 = 3c^2.$$

Repeating the argument, we see that 3 divides $b$. But then 3 divides
$\gcd(a, b)$. This is absurd, and therefore $\sqrt{3}$ must be irrational.

There is some controversy about who first proved the irrationality of
certain numbers. It was the school of Pythagoras who first showed that $\sqrt{2}$
was irrational. A number $a$ is called square free if there is no integer $b > 1$
such that $b^2 \mid a$. Plato credits his teacher Theodorus with the irrationality of
the square roots of the square free numbers from 3 to 17. Scholars speculate
that the reason for stopping at 17 is because the Fundamental Theorem of
Arithmetic was not known. (See pp. 50-51.) We will see in Proposition 1.7.1
that these proofs hold in much more generality. Indeed, later in the
section on polynomials, even stronger irrationality results can be obtained.

Here is a generalization of this fact. 


1.7.1. Proposition. Suppose that $n$ and $k$ are positive integers such that
$\sqrt[k]{n}$ is rational. Then $\sqrt[k]{n}$ is an integer.

Proof. Again let us write $\sqrt[k]{n} = a/b$ with $\gcd(a,b) = 1$. Taking the $k$-th
power and cross multiplying, we obtain

$$nb^k = a^k.$$

If $b \ne 1$, let $p$ be any prime factor of $b$. Then $p$ divides $a^k$, and hence divides
$a$. Therefore, $p$ divides $\gcd(a,b)$. Since $\gcd(a,b) = 1$, $b$ cannot have any
prime factors; that is, $b = 1$. Hence $\sqrt[k]{n} = a$ is an integer. ∎
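
One practical reading of Proposition 1.7.1 is that deciding whether the k-th root of n is rational only requires a search through integers. The Python sketch below does this with a binary search in exact integer arithmetic; the function name integer_kth_root is ours. A return value of None means that no integer k-th root exists, and so, by the proposition, the root is irrational.

```python
def integer_kth_root(n, k):
    """Return the integer a >= 1 with a**k == n, or None if there is none.

    Assumes n and k are positive integers.
    """
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive integers")
    lo, hi = 1, n
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** k  # exact, since Python integers do not overflow
        if power == n:
            return mid
        if power < n:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


if __name__ == "__main__":
    for n, k in [(27, 3), (3, 2), (1024, 10), (17, 2)]:
        print(n, k, integer_kth_root(n, k))
```

Floating point roots are deliberately avoided here, since rounding could misclassify large perfect powers.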


Ad hoc methods can be used to prove that various algebraic expressions
are irrational. (See the exercises.) Later in this book, there will be more
sophisticated ways of proving irrationality. For other important numbers
such as $\pi$ and $e$, one needs an analytic expression that defines these numbers
in order to prove irrationality. It is much more difficult to show that these
numbers do not satisfy any algebraic equation at all. Such numbers are
called transcendental. It is possible to give an elementary proof of the
irrationality of $e$. In Chapter 6, we will give a much more devious proof that
$e$ is indeed transcendental.


1.7.2. Proposition. e is irrational. 


Proof. We need an expression for $e$. A useful expression from calculus is

$$e = \sum_{n \ge 0} \frac{1}{n!}.$$

Suppose that $e = a/k$ where $a$ and $k$ are positive integers. Compute

$$a(k-1)! = k!\,e = \sum_{n=0}^{k} \frac{k!}{n!} + \sum_{n \ge k+1} \frac{k!}{n!}.$$

The first sum on the RHS is an integer. Hence there is an integer

$$a(k-1)! - \sum_{n=0}^{k} \frac{k!}{n!} = \frac{1}{k+1} + \frac{1}{(k+1)(k+2)} + \frac{1}{(k+1)(k+2)(k+3)} + \cdots.$$

Estimate the size of this 'integer', say $b$, by summing a geometric series:

$$0 < b < \sum_{m \ge 1} (k+1)^{-m} = \frac{(k+1)^{-1}}{1 - (k+1)^{-1}} = \frac{1}{k}.$$

There are no integers in this range, and so we have a contradiction. Hence
$e$ must be irrational. ∎
Exercises


1. Show that $\beta := \sqrt{2} + \sqrt{3}$ is irrational. HINT: if $\beta = p/q$ with $\gcd(p, q) = 1$,
do algebraic manipulations to eliminate the square roots, and deduce
that $q \mid p$.


2. Show that $\gamma = \sqrt{2} + \sqrt[3]{5}$ is irrational.
HINT: if $\gamma = p/q$ with $\gcd(p,q) = 1$, get rid of the cube root first; then
eliminate the square root.


3. Show that logy, 7 is irrational. 


A. Let ay € {1,2,...,9} for n > 1. Show that >> aw is irrational. 
n>1 i 
5. Let $\alpha$ be a root of a polynomial $p(x) = x^n + c_{n-1}x^{n-1} + \cdots + c_1 x + c_0$,
where $c_i \in \mathbb{Z}$ and $c_0 \ne 0$. Show that $\alpha$ is either an integer or is irrational.
HINT: if $\alpha = a/b$ with $\gcd(a,b) = 1$, compute $b^n p(\alpha)$ in two ways, and
deduce that $b \mid a^n$.


6. Find a monic polynomial with integer coefficients with $\sqrt{2} + \sqrt{3}$ as a
root.


7. Find a monic polynomial with integer coefficients with $\sqrt{2} + \sqrt[3]{5}$ as a
root.


8. Show that if $k$ is not a power of any other integer, then $\log_k a$ is either
an integer or irrational for each positive integer $a$.


9. In this exercise we show there exist irrational numbers $q$ and $r$ such that
$q^r$ is rational. Prove that one may take $r = \sqrt{2}$ with either $q = \sqrt{2}$ or
$q = \sqrt{2}^{\sqrt{2}}$.


1.8. Unique Factorization in More General Rings 


This section has a much greater level of abstraction than the rest of this 
chapter. It could be put off until a later point. However since the proof 
is fresh in our minds, it makes sense to do it here. Otherwise we will find 
ourselves providing the same proof repeatedly in various contexts. 

Having now proved the Fundamental Theorem of Arithmetic 1.6.4, it
is worthwhile to figure out the level of generality in which our proof is
valid. You will notice that the Fundamental Theorem of Arithmetic relied
on Euclid's algorithm 1.5.4, which in turn relied on the Division Algorithm
1.5.1. We will see that any ring where an appropriate analogue of the division
algorithm holds will satisfy a type of Euclidean algorithm. This will then
be used to prove a version of the Fundamental Theorem of Arithmetic for
any such ring.

To begin, a basic property that $\mathbb{Z}$ enjoys is that two non-zero integers
cannot multiply to be zero. We are interested in rings in general that satisfy
this constraint.


1.8.1. Definition. An element $a$ of a ring $R$ is a zero divisor if $a \ne 0$
and there exists a non-zero element $b \in R$ with $ab = 0$. A commutative ring
$R$ with no zero divisors is called an integral domain.


As we show in the next result, integral domains satisfy a familiar can- 
cellation property. You will notice that this cancellation property for Z is 
used throughout the last few sections. So, in order to make the proof of 
the Fundamental Theorem of Arithmetic work in greater generality, it is 
important that we restrict attention to integral domains. 


1.8.2. Lemma. Let $R$ be an integral domain. If $a, b, c \in R$ and $ab = ac$,
then $a = 0$ or $b = c$.


Proof. We see $a(b - c) = 0$ and since $R$ has no zero divisors, we must have
$a = 0$ or $b - c = 0$. ∎


Since our ultimate goal in this section is to prove an analogue of the 
Fundamental Theorem of Arithmetic, we need a suitable notion of units 
and prime numbers. In general rings, primes are called irreducibles. 


1.8.3. Definition. A unit of a ring $R$ is an element $x$ which has a
multiplicative inverse $y$, i.e. there is an element $y$ satisfying $xy = yx = 1$.
We often write $y = x^{-1}$. The set of units of $R$ is denoted by $R^*$.


1.8.4. Remark. In the exercises, you will prove that the $y$ in Definition 1.8.3
is uniquely determined. Thus, the notation $x^{-1}$ is unambiguous.


1.8.5. Example. In $\mathbb{Z}$, the units are $\pm 1$. In $\mathbb{Q}$, every non-zero element is
a unit. See Exercise 2 for some information on the units of $\mathbb{Z}[\sqrt{2}]$.


1.8.6. Definition. Let $R$ be an integral domain. An element $p \in R$ is
irreducible if $p \notin R^*$ and whenever $p = ab$ for $a, b \in R$, either $a$ or $b$ is a
unit.


We next axiomatize what it means for a ring to have a division algorithm.
The key property of the division algorithm 1.5.1 is that when we divide $b$
into $a$, the absolute value of the remainder $r$ is smaller than that of $b$. We
will be interested in rings which, unlike $\mathbb{Z}$, may not have a useful ordering.
(See Exercise 9.) Thus, we cannot literally require in our division algorithm
that $r < b$. However, we can look for an auxiliary function $f$ which measures
"how big" an element is and we can require that $f(r) < f(b)$.


1.8.7. Definition. An integral domain $R$ is a Euclidean domain if there
is a Euclidean function $f : R \to \mathbb{N}_0$ satisfying the following properties:
(1) $f(a) \le f(ab)$ for all $a, b \in R$ with $b \ne 0$. (order)
(2) for all $a, b \in R$ with $b \ne 0$, there exist $q, r \in R$ with
$a = bq + r$ and $f(r) < f(b)$. (division)


When we wish to emphasize the function f, we will say that (R,f) is a 
Euclidean domain. 


1.8.8. Lemma. Let $(R, f)$ be a Euclidean domain. Then
(1) if $a \in R \setminus \{0\}$, then $f(0) < f(1) \le f(a)$.
(2) if $a, b \in R \setminus \{0\}$, then $f(a) = f(ab)$ if and only if $b \in R^*$.
(3) if $b \in R \setminus \{0\}$, then $f(b) = f(1)$ if and only if $b$ is a unit.


Proof. If $a \ne 0$, then by the order property,

$$f(1) \le f(1 \cdot a) = f(a).$$

Now take $a = b = 1$ and use the division property to write $1 = 1 \cdot q + r$ with
$f(r) < f(1)$. This must mean that $r = 0$ and $f(0) < f(1)$. So (1) holds.

If $b \in R^*$, then $a = (ab)b^{-1}$, and so $f(ab) \le f(a) \le f(ab)$; whence
$f(a) = f(ab)$. Conversely, suppose $f(ab) = f(a)$. By the division property,
there exist $q, r \in R$ such that $a = (ab)q + r$ with $f(r) < f(ab) = f(a)$. Hence
$r = a(1 - bq)$. If $r \ne 0$, we would get $f(a) \le f(r) < f(a)$, a contradiction.
So, we must have $0 = r = a(1 - bq)$. Since $a \ne 0$, Lemma 1.8.2 implies that
$1 - bq = 0$. Thus $b \in R^*$. So (2) holds.

The third statement now follows by taking $a = 1$ in (2). ∎


1.8.9. Remarks. In Exercise 6, you will show that if $R$ has a function $f$
satisfying the division property, then $R$ has a Euclidean function.

In Exercise 8, you will show that if $f$ is a Euclidean function and
$g : \mathrm{Ran}\, f \to \mathbb{N}_0$ is strictly increasing, then $g \circ f$ is also a Euclidean function.
Thus there are many different choices for the Euclidean function; so
this function is not unique. It means that we can always choose $g$ so that
$g(f(0)) = 0$ and $g(f(1)) = 1$. Thus we may suppose that $f(0) = 0$ and
$f(b) = 1$ if and only if $b$ is a unit.


1.8.10. Example. The ring of integers $\mathbb{Z}$ is a Euclidean domain, where we take
$f(n) := |n|$. Notice that this particular choice of $f$ has a lot of structure: for
example, $|ab| = |a||b|$. Also if $a \mid b$ and $b \ne 0$, then $|a| \le |b|$ and we have
equality if and only if $a = \pm b$.

We will see many other examples of Euclidean domains throughout the
course, such as the Gaussian integers (Section 3.5), other quadratic number
domains (see Section and Exercise 3 of Section 3.5), and polynomial
rings over a field (Section 6.2). In this last example, the function $f$ is the
degree of the polynomial.


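1.8.11. Example. We will show that $\mathbb{Z}[\sqrt{2}]$ is a Euclidean domain for the
function $f(x) = |N(x)|$, where $N$ is the norm function defined in Exercise 1.
That is, if $x = x_1 + x_2\sqrt{2} \in \mathbb{Q}[\sqrt{2}]$, then $N(x) = x_1^2 - 2x_2^2$. Exercise 1 shows
that $f$ is multiplicative. Since $f$ maps $\mathbb{Z}[\sqrt{2}]$ into $\mathbb{N}_0$,

$$f(ab) = f(a) f(b) \ge f(a) \quad\text{for all } a, b \in \mathbb{Z}[\sqrt{2}] \setminus \{0\}.$$

Suppose that $a = a_1 + a_2\sqrt{2}$ and $b = b_1 + b_2\sqrt{2} \ne 0$ are given. Let

$$\begin{aligned}
x = \frac{a}{b} &= \frac{a_1 + a_2\sqrt{2}}{b_1 + b_2\sqrt{2}} \cdot \frac{b_1 - b_2\sqrt{2}}{b_1 - b_2\sqrt{2}} \\
  &= \frac{a_1 b_1 - 2a_2 b_2}{N(b)} + \frac{a_2 b_1 - a_1 b_2}{N(b)}\sqrt{2} \\
  &= x_1 + x_2\sqrt{2} \in \mathbb{Q}[\sqrt{2}].
\end{aligned}$$

That is, $x_1$ and $x_2$ are rational. Choose integers $c_1, c_2$ so that $|x_1 - c_1| \le \frac12$
and $|x_2 - c_2| \le \frac12$. Define $c = c_1 + c_2\sqrt{2} \in \mathbb{Z}[\sqrt{2}]$. Then let

$$r = r_1 + r_2\sqrt{2} = a - bc = b(x - c) = b\big((x_1 - c_1) + (x_2 - c_2)\sqrt{2}\big).$$

Note that $r \in \mathbb{Z}[\sqrt{2}]$. However the norm is defined on $\mathbb{Q}[\sqrt{2}]$ and is
multiplicative by Exercise 1. It follows that

$$N(r) = N(b)\big((x_1 - c_1)^2 - 2(x_2 - c_2)^2\big).$$

Now $(x_1 - c_1)^2 \in [0, \frac14]$ and $(x_2 - c_2)^2 \in [0, \frac14]$, so that

$$(x_1 - c_1)^2 - 2(x_2 - c_2)^2 \in [-\tfrac12, \tfrac14].$$

Therefore $f(r) = |N(r)| \le \frac12 |N(b)| = \frac12 f(b)$. Thus $\mathbb{Z}[\sqrt{2}]$ has a division
algorithm, and $f$ is a Euclidean function.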
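
The construction in this example is completely explicit, so it can be carried out by machine. In the Python sketch below, an element u + v√2 of Z[√2] is represented by the pair (u, v); this representation and the names norm, nearest_int and divide are our own assumptions, made only for illustration. Exact rational arithmetic from the standard fractions module is used to find x_1 and x_2, which are then rounded to nearest integers.

```python
import math
from fractions import Fraction


def norm(u, v):
    """N(u + v*sqrt(2)) = u^2 - 2*v^2."""
    return u * u - 2 * v * v


def nearest_int(x):
    """Return an integer c with |x - c| <= 1/2, for a Fraction x."""
    return math.floor(x + Fraction(1, 2))


def divide(a, b):
    """Divide a by b in Z[sqrt(2)], with elements given as integer pairs.

    Returns (c, r) such that a == b*c + r and |N(r)| <= |N(b)| / 2.
    """
    a1, a2 = a
    b1, b2 = b
    nb = norm(b1, b2)
    if nb == 0:  # for integer pairs this happens only when b == (0, 0)
        raise ZeroDivisionError("b must be non-zero")
    # x = a/b = a * conjugate(b) / N(b) has rational coordinates x1, x2.
    x1 = Fraction(a1 * b1 - 2 * a2 * b2, nb)
    x2 = Fraction(a2 * b1 - a1 * b2, nb)
    c1, c2 = nearest_int(x1), nearest_int(x2)
    # r = a - b*c, computed entirely with integers.
    r1 = a1 - (b1 * c1 + 2 * b2 * c2)
    r2 = a2 - (b1 * c2 + b2 * c1)
    return (c1, c2), (r1, r2)


if __name__ == "__main__":
    c, r = divide((7, 5), (3, 1))
    print("quotient", c, "remainder", r, "norm of remainder", norm(*r))
```

Because the remainder satisfies the stronger bound f(r) <= f(b)/2, repeated division in this ring terminates after at most about log_2 f(b) steps.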




Our next result shows that Euclidean domains satisfy a type of Euclidean
algorithm. Since $R$ is not necessarily ordered, we cannot speak of
the greatest common divisor of $a$ and $b$. However, the properties of $r_k$ listed
in Theorem 1.8.12 below capture the fact that $r_k$ behaves like the gcd of $a$ and
$b$. Indeed, the first property says $r_k$ is a common divisor of $a$ and $b$; and
the third property says that if $e$ is any other common divisor, $r_k$ must be
"greater than" $e$ in the sense that $e$ divides $r_k$. Notice that in the case when
$R = \mathbb{Z}$, this reduces to saying that $r_k = \pm\gcd(a, b)$.
1.8.12 Euclidean Algorithm for Euclidean Domains. Let $(R, f)$
be a Euclidean domain and $a, b \in R$ with $b \ne 0$. Then using the division
algorithm repeatedly yields a sequence

$$\begin{aligned}
a &= bq_0 + r_1 \\
b &= r_1 q_1 + r_2 \\
r_1 &= r_2 q_2 + r_3 \\
&\ \ \vdots \\
r_{k-1} &= r_k q_k + r_{k+1}
\end{aligned}$$

with $r_1, r_2, \dots, r_k$ non-zero, $r_{k+1} = 0$, and $f(b) > f(r_1) > \cdots > f(r_k)$.
Furthermore, $r_k$ satisfies the following properties:

(1) $r_k \mid a$ and $r_k \mid b$,

(2) there exist $s, t \in R$ such that $as + bt = r_k$,

(3) for any $e \in R$, if $e \mid a$ and $e \mid b$, then $e \mid r_k$.


Proof. For notational convenience, we let $r_{-1} = a$ and $r_0 = b$. Let us
first show that the process terminates; i.e. there exists $k$ with $r_{k+1} = 0$.
Otherwise the process would define $r_i \ne 0$ for all $i \ge 1$. Consider the set

$$\{ f(r_i) : r_i \ne 0,\ i \ge 1 \}$$

with the $r_i$ defined as in the statement of the theorem. Since all $f(r_i)$ are
positive integers, by the well-ordering principle, there must be a least
element $f(r_k)$. If $r_{k+1} \ne 0$, then we would have $f(r_{k+1}) < f(r_k)$, contradicting
the fact that $f(r_k)$ is minimal. Thus, $r_{k+1} = 0$.

We now prove (1). Since $r_{k+1} = 0$, we have $r_{k-1} = r_k q_k$ and so $r_k \mid r_{k-1}$.
Now, inductively assume $r_k \mid r_{i+1}$ and $r_k \mid r_{i+2}$. Since $r_i = r_{i+1} q_{i+1} + r_{i+2}$,
we see $r_k \mid r_i$ as well. This proves that $r_k$ divides all $r_i$; in particular it
divides $r_{-1} = a$ and $r_0 = b$.

For (2), we prove by induction that there exist $s_i, t_i \in R$ with $as_i + bt_i = r_i$.
For the base case of the induction, we have $a = r_{-1} \cdot 1 + r_0 \cdot 0$ and
$b = r_{-1} \cdot 0 + r_0 \cdot 1$. We may therefore take $s_{-1} = 1$, $t_{-1} = 0$, $s_0 = 0$ and
$t_0 = 1$. Now assume that there exist $s_{i-1}, t_{i-1}, s_i, t_i \in R$ with

$$as_{i-1} + bt_{i-1} = r_{i-1} \quad\text{and}\quad as_i + bt_i = r_i.$$

We will show the existence of $s_{i+1}, t_{i+1} \in R$ with $as_{i+1} + bt_{i+1} = r_{i+1}$. By
definition, we have

$$\begin{aligned}
r_{i+1} &= r_{i-1} - r_i q_i \\
        &= (as_{i-1} + bt_{i-1}) - (as_i + bt_i)\, q_i \\
        &= a(s_{i-1} - q_i s_i) + b(t_{i-1} - t_i q_i),
\end{aligned}$$

so we may take $s_{i+1} = s_{i-1} - q_i s_i$ and $t_{i+1} = t_{i-1} - t_i q_i$. We have therefore
shown that every $r_j$ is of the form $as_j + bt_j$ for some $s_j, t_j \in R$. In particular,
the statement is true when $j = k$.

Now (3) follows from (2) since if $as + bt = r_k$, then any common divisor
of $a$ and $b$ must also divide $r_k$. ∎


1.8.13. Definition. Let $a, b \in R$ with $R$ an integral domain. We say $a$
and $b$ are relatively prime if for every $e \in R$, $e \mid a$ and $e \mid b$ implies $e \in R^*$.


1.8.14. Example. When $R = \mathbb{Z}$, Definition 1.8.13 agrees with the usual
notion of relative primality since $\mathbb{Z}^* = \{\pm 1\}$. In $\mathbb{Q}$, any two non-zero
elements are relatively prime.


1.8.15. Corollary. Let $(R, f)$ be a Euclidean domain. Then $a, b \in R$ are
relatively prime if and only if there exist $s, t \in R$ such that $as + bt = 1$.


Proof. Suppose $as + bt = 1$. If $d \mid a$ and $d \mid b$, then $d \mid 1$ so $d \in R^*$. This
shows $a$ and $b$ are relatively prime.

Conversely, applying Euclid's algorithm 1.8.12, we see there exist $s, t, r_k$
in $R$ with $as + bt = r_k$, $r_k \mid a$ and $r_k \mid b$. Since $a$ and $b$ are relatively prime,
$r_k \in R^*$. Hence, $a(s r_k^{-1}) + b(t r_k^{-1}) = 1$. ∎


We next show that every non-zero non-unit can be factored into a product
of finitely many irreducibles. This gives an analogue of Theorem 1.3.3.


1.8.16. Proposition. Let $(R, f)$ be a Euclidean domain. Then every
non-zero non-unit $a \in R$ is a product of finitely many irreducible elements.


Proof. We do induction on $f(a)$. By Lemma 1.8.8 (1), $f(a) \ge f(1)$ for all
$a \ne 0$. Let us begin with the base case of the induction, namely $f(a) = f(1)$.
By Lemma 1.8.8 (3), we have $a \in R^*$ and so there is nothing to show.
Next, fix a number $n > 1$ and assume that the statement is true for all
$b \in R$ with $1 < f(b) < n$. Then we will prove the statement for all $a$ with
$f(a) = n$. If $a$ is irreducible then we are done. So, we may assume $a$ is not
irreducible, in which case, by definition, we have $a = bc$ with $b, c \notin R^*$. Then
Lemma 1.8.8 (2) shows $f(b) < f(a)$ since $c \notin R^*$. Similarly, $f(c) < f(a)$. By
our inductive hypothesis, we know both $b$ and $c$ are products of finitely many
irreducible elements. Since $a = bc$, we can multiply these two factorizations
together to obtain $a$ as a product of finitely many irreducible elements. ∎


We next prove an analogue of Corollary 1.6.2, which was the key input
to showing the Fundamental Theorem of Arithmetic.


1.8.17. Proposition. Let $R$ be a Euclidean domain and suppose $p \in R$
is irreducible. If $p$ divides the product $a_1 a_2 \cdots a_k$, then there is an index $j$
so that $p \mid a_j$.

Proof. We proceed by induction on $k$. For $k = 1$, there is nothing to prove.

Now let $k = 2$. First suppose that $p$ and $a_1$ are relatively prime. By
Corollary 1.8.15, there exist $s, t \in R$ such that $1 = ps + a_1 t$. Therefore
$a_2 = p a_2 s + a_1 a_2 t$. Since $p \mid a_1 a_2$, we get $p \mid a_2$. On the other hand, suppose
$p$ and $a_1$ are not relatively prime. Thus, there exists $d \notin R^*$ such that $d \mid p$
and $d \mid a_1$. By the definition of an irreducible element, we see $d = up$ where
$u \in R^*$. Then $up = d \mid a_1$, so $p \mid a_1$. This completes the $k = 2$ case.

We now consider $k > 2$. We have $p \mid b a_k$, where $b = a_1 a_2 \cdots a_{k-1}$. If
$p \mid a_k$, we are done. Otherwise we may assume $p$ does not divide $a_k$. Then
by the $k = 2$ case, we see $p \mid a_1 a_2 \cdots a_{k-1}$. By induction, there exists $j$ such
that $p \mid a_j$. ∎


We now come to the main result of this section: in every Euclidean do- 
main, we can uniquely factor elements as a product of irreducible elements. 


1.8.18. Unique Factorization for Euclidean Domains. Let $(R, f)$
be a Euclidean domain. Then every non-zero non-unit $a \in R$ can be written
as a product of finitely many irreducible elements. Moreover, if


    $a = p_1 \cdots p_r = q_1 \cdots q_s$


with all $p_i, q_j$ irreducible, then $r = s$ and after reordering the $q$'s, we have
$q_i = u_i p_i$ for some $u_i \in R^*$.


Proof. By Proposition [1.8.16] we know that every non-zero non-unit $a$ can
be factored into a product of finitely many irreducible elements. To prove
the unique factorization statement, we proceed by induction on $f(a)$. That
is, we let $P(n)$ be the statement that the conclusion of the theorem is valid
for every $a \in R$ with $f(a) = n$.

By Lemma [1.8.8] (1), we know $f(a) \ge f(1)$ for all non-zero $a$. Let us
begin with the base case of the induction, namely $f(a) = f(1)$. By Lemma
[1.8.8] (3), we have $a \in R^*$. Then if $p_1 \cdots p_r = a$, we see $p_1 \mid 1$, so $p_1 \in R^*$,
which contradicts the definition of an irreducible element. Therefore, $a$ has
no factorization into irreducible elements, and hence the statement $P(f(1))$
is vacuously true.

Next, assume that $f(a) = n > f(1)$ and $P(k)$ is true for all $k$ with
$f(1) \le k < n$. We will prove the statement for $a$. Assume that
$a = p_1 \cdots p_r = q_1 \cdots q_s$ are two factorizations into irreducibles. Then


    $p_1 \mid a = p_1 \cdots p_r = q_1 \cdots q_s$.


By Proposition [1.8.17], $p_1 \mid q_j$ for some $j$. After reordering the $q$'s, we may
suppose that $p_1 \mid q_1$. Since $p_1 \notin R^*$, using the definition of an irreducible
element applied to $q_1$, we see $q_1 = u_1 p_1$ for some $u_1 \in R^*$. Therefore,


    $p_2 p_3 \cdots p_r = \frac{a}{p_1} = q_2' q_3 \cdots q_s$,


where $q_2' = u_1 q_2$. Directly from the definition, we have that $q_2'$ is also
irreducible. Since $\frac{a}{p_1} \mid a$ and $p_1 \notin R^*$, we have from Lemma [1.8.8] (2) that
$f(\frac{a}{p_1}) < f(a)$. By the induction hypothesis, we conclude that $r - 1 = s - 1$
(i.e. $r = s$) and after reordering the $p_i$, we have $q_2' = v p_2$ and $q_i = u_i p_i$ for
some $v, u_i \in R^*$ for $3 \le i \le r$. Thus $q_2 = (u_1^{-1} v) p_2$; so we set $u_2 = u_1^{-1} v \in R^*$.
This establishes $P(n)$. By induction, the proof is complete. □


Exercises 


1. Let $\mathbb{Q}[\sqrt{2}] = \{r + s\sqrt{2} : r, s \in \mathbb{Q}\}$. For $x = r + s\sqrt{2}$ in $\mathbb{Q}[\sqrt{2}]$, define the
   norm of $x$ to be $N(x) = r^2 - 2s^2 \in \mathbb{Q}$.
   (a) Show that $r + s\sqrt{2} = t + u\sqrt{2}$ for $r, s, t, u \in \mathbb{Q}$ implies $r = t$ and
       $s = u$.
   (b) Show that if $x, y \in \mathbb{Q}[\sqrt{2}]$ and $y \ne 0$, then $x/y \in \mathbb{Q}[\sqrt{2}]$.
   (c) Show that $N(x) = 0$ implies that $x = 0$ for $x$ in $\mathbb{Q}[\sqrt{2}]$.
   (d) Show that $N(xy) = N(x)N(y)$ for all $x, y$ in $\mathbb{Q}[\sqrt{2}]$.


2. (a) Show that $1 + \sqrt{2}$ and $17 + 12\sqrt{2}$ are units in $\mathbb{Z}[\sqrt{2}]$.
   (b) Prove that $x \in \mathbb{Z}[\sqrt{2}]$ is a unit if and only if $N(x) = \pm 1$.
       Hint: $N(x)$ is an integer.
   (c) Prove that there are infinitely many units in $\mathbb{Z}[\sqrt{2}]$.
       Hint: find a way to make other units from $1 + \sqrt{2}$.


3. (a) Show that 2 and 7 are not irreducible in $\mathbb{Z}[\sqrt{2}]$.
   (b) Show that $z = 5 - 2\sqrt{2}$ is irreducible in $\mathbb{Z}[\sqrt{2}]$.
       Hint: compute $N(z)$.
   (c) Show that 3 is irreducible in $\mathbb{Z}[\sqrt{2}]$.
       Hint: What are the possible remainders after dividing a square by
       8? Show that $N(x) = \pm 3$ is impossible.


4. Find a Euclidean function for $\mathbb{Z}[\sqrt{3}]$.
   Hint: modify Example [1.8.11].


5. Let $R$ be a ring. Suppose that $x \in R$ and there are elements $y_1, y_2 \in R$
   such that $y_1 x = 1 = x y_2$. Prove that $y_1 = y_2$; and so $x$ is a unit. In
   particular, if $x$ is a unit, then it has a unique inverse.


6. Let $f$ be a function on a ring $R$ satisfying the division property of a
   Euclidean domain. Define $g(a) = \min\{ f(ab) : b \ne 0 \}$. Prove that $g$ is a
   Euclidean function for $R$.


7. (a) Prove that $(\mathbb{Q}, f)$ is a Euclidean domain, where $f(0) = 0$ and $f(a) = 1$
       for $0 \ne a \in \mathbb{Q}$.
   (b) More generally, let $F$ be a commutative ring such that $F^* = F \setminus \{0\}$;
       such a ring $F$ is called a field. Set $f(0) = 0$ and $f(a) = 1$ for all
       $a \ne 0$. Prove that $f$ is a Euclidean function for $F$.


8. Let $(R, f)$ be a Euclidean domain. Let $g : \operatorname{Ran} f \to \mathbb{N}_0$ be any strictly
   increasing function. Prove that $g \circ f$ is also a Euclidean function for $R$.


9. (a) Show that $\mathbb{Z}[\sqrt{2}]$ is 'dense' in $\mathbb{R}$, meaning that if $x < y$ in $\mathbb{R}$, then
       there are integers $a, b$ so that $x < a + b\sqrt{2} < y$.
       Hint: for each $n \in \mathbb{N}$, there is some $a_n \in \mathbb{Z}$ so that $a_n + n\sqrt{2} \in (0, 1)$.
       Choose $k$ so that $\frac{1}{k} < y - x$. Use the pigeonhole principle to find
       two numbers $m < n$ so that $0 < |(a_n + n\sqrt{2}) - (a_m + m\sqrt{2})| < \frac{1}{k}$.
   (b) Explain why the order on $\mathbb{Z}[\sqrt{2}]$ induced from $\mathbb{R}$ cannot be used to
       define a Euclidean function.


Notes on Chapter 1 


Presumably numbers arose from counting. Once civilizations developed 
some mode of writing, they also developed ways to record numbers. The 
ancient Egyptians had a system for writing numbers up to a million. The 
ancient Chinese had a base 10 system of numbers. Babylonians developed 
a system base 60. 

The notion of zero came later, first as a placeholder for writing numbers 
in base 10. For example, the Chinese just left a blank space for a zero in 
a base 10 number. The Babylonians first left it to context, but eventually 
adopted a symbol to indicate a blank space around 400 BCE. The Greeks 
however did not adopt the concept. The symbol zero apparently comes from 
India, possibly as early as 200 CE. It was brought back to Europe by the 
Arabs, who adopted it. Around 700 CE, Brahmagupta gave arithmetic rules 
for working with 0 as a number in its own right. This spread to China, with 
records from 1247 CE. Around this time, Fibonacci was proposing the use of 
0. It wasn’t until the 1600s that 0 came into more common usage in Europe. 

Negative numbers were not generally accepted in ancient times. There 
is a record of the use of negative numbers for solving equations in China 
around 100 BCE-50 CE. In Greece, in the third century, Diophantus made 
use of negative numbers as ‘a number to be subtracted’ for use in solving 
equations. However he apparently did not accept them as numbers on their 
own. In the 7th century, Brahmagupta used negative numbers to reduce 
the solution of a quadratic equation to a single case. (Diophantus had 
three cases.) Records from China show negative numbers in use by the 
13th century. In 1545, Cardano used negative numbers in his formulae 
for roots of cubics and quartics. In the 17th century, Descartes partially 
accepted negative numbers, although he considered them as false solutions 
to equations. In the 18th century, Euler discussed operations with positive 
and negative numbers. Yet still in the 19th century, Hamilton attempted 
‘to put negative numbers on a firm theoretical footing’. By this time, it was 
becoming more accepted—a surprisingly long time! 

Euclid wrote a 13 volume treatise on mathematics in 300 BCE. It contains
the Euclidean algorithm and the proof of an infinitude of primes. It
also contains Lemma [1.6.1] and Corollary [1.6.2]. As we saw, it is a small step
from these results to the Fundamental Theorem of Arithmetic, but it does
not appear in Euclid. The first precise statement of the FTA is by Gauss in
1801.

The Euclidean algorithm for the Gaussian integers was known to Gauss
(see Section [3.5]). Generally people only considered Euclidean algorithms for
the norm function until 1950. The abstract notion of a Euclidean domain 
was implicit in work by Hasse in 1928. 

Hardy and Wright is a classic book on number theory that is still 
relevant today. It differs from many number theory books in that it often 
discusses different proofs, and it contains many historical notes. The 6th 
edition has updated notes that reference many more recent results. Riben- 
boim’s The little book of bigger primes is, as the title suggests, all 
about primes. There are many proofs of the infinitude of primes in Chapter 
1. Stark is a more modern number theory book whose introduction, in 
particular, is well worth reading by readers of our book. Silverman is 
another nice introduction to number theory. 

Alaca and Williams [2] is an algebraic number theory book which treats
Euclidean domains in general. In particular, they give many results about
the quadratic number domains $\mathbb{Z}[\sqrt{d}]$ for $d \equiv 2, 3 \pmod 4$ and $\mathbb{Z}[\frac{1+\sqrt{d}}{2}]$
when $d \equiv 1 \pmod 4$. We explain elsewhere in the text why we use
$\mathbb{Z}[\frac{1+\sqrt{d}}{2}]$ rather than $\mathbb{Z}[\sqrt{d}]$ when $d \equiv 1 \pmod 4$. Stark [37, Section 8.4]
also has interesting material about when quadratic number domains are
Euclidean or UFD (unique factorization domains), which is a strictly larger
class. Stark himself made important contributions to this problem.


Chapter 2 


Modular Arithmetic 


In this chapter, we discuss computations ‘modulo n’, meaning that we only 
keep track of the remainder on division by n. We discuss solving systems of 
equations in several interesting contexts. 


2.1. Linear Equations 


In this section, we look for integer solutions of the simplest type of equations. 
An equation in which one searches for integer solutions is called a Diophan- 
tine equation, after the Greek mathematician Diophantus. Consider the 
equation 

ax + by =c 
where $a$, $b$ and $c$ are given integers. For example, $5x + 7y = 1$ has the solution
$x = 3$ and $y = -2$. But $6x + 10y = 15$ has no solutions because the left side
is even, and 15 is odd. In general, $ax + by$ is always divisible by $d = \gcd(a, b)$.
Thus a necessary condition for a solution is 


gcd(a, b)|c. 


This is also sufficient. It follows from the Euclidean algorithm that 
there are integers s and t so that as + bt = d. So if c = dz, a solution of our 
equation is given by x = sz and y = tz. Therefore we have proved most of 
the following theorem. 


2.1.1. Theorem. The Diophantine equation $ax + by = c$ has a solution
if and only if $d = \gcd(a, b)$ divides $c$. Moreover, if $\{x_0, y_0\}$ is one solution,
then all solutions are given by


    $x = x_0 + k\frac{b}{d}$,  $y = y_0 - k\frac{a}{d}$  for $k \in \mathbb{Z}$.


Proof. The first part has been done. So suppose that $\{x_0, y_0\}$ and $\{x, y\}$
are solutions of $ax + by = c$. Then $X = x - x_0$ and $Y = y - y_0$ satisfy

    $aX + bY = (ax + by) - (a x_0 + b y_0) = 0$.

Hence $aX = -bY$. Dividing by $d = \gcd(a, b)$ yields $\frac{a}{d} X = -\frac{b}{d} Y$. But
$\gcd(\frac{a}{d}, \frac{b}{d}) = 1$. Thus by Lemma [1.6.1], $\frac{a}{d} \mid Y$ and $\frac{b}{d} \mid X$. Set $k = \frac{Xd}{b}$. So

    $x = x_0 + X = x_0 + k\frac{b}{d}$.

It follows that $Y = -\frac{a}{b} X = -k\frac{a}{d}$, and thus

    $y = y_0 + Y = y_0 - k\frac{a}{d}$.

Conversely, it is clear that every pair $\{x, y\}$ of this form is a solution. □
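
The theorem can be explored computationally. The following is a minimal Python sketch, not part of the original notes: the function names `extended_gcd` and `solve_linear_diophantine` are illustrative choices, and the code assumes $a$ and $b$ are not both zero. It finds one solution using the extended Euclidean algorithm and returns the step sizes $b/d$ and $a/d$ that generate all the others.

```python
def extended_gcd(a, b):
    """Return (d, s, t) with d = gcd(a, b) >= 0 and a*s + b*t == d."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:  # normalise so that the gcd is non-negative
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def solve_linear_diophantine(a, b, c):
    """Solve a*x + b*y == c, assuming (a, b) != (0, 0).

    Returns None if gcd(a, b) does not divide c. Otherwise returns
    (x0, y0, step_x, step_y), and every solution has the form
    x = x0 + k*step_x, y = y0 - k*step_y for an integer k.
    """
    d, s, t = extended_gcd(a, b)
    if c % d != 0:
        return None
    z = c // d
    return s * z, t * z, b // d, a // d


print(solve_linear_diophantine(6, 10, 15))  # no solution: gcd is 2
x0, y0, step_x, step_y = solve_linear_diophantine(5, 7, 1)
for k in range(-2, 3):
    x, y = x0 + k * step_x, y0 - k * step_y
    print(k, x, y, 5 * x + 7 * y)
```

Each printed row of the loop should end in 1, since every member of the family satisfies $5x + 7y = 1$.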


Now we can handle more variables with a simple induction argument.
If $a_1, \dots, a_n$ are integers which are not all 0, then we denote the greatest
common divisor of the set $\{a_1, \dots, a_n\}$ by $\gcd(a_1, \dots, a_n)$. We define
$\gcd(0, \dots, 0) = 0$. Like the Euclidean algorithm (1.5.4), the corollary below
gives a constructive method for finding solutions to the Diophantine
equation $\sum_{i=1}^{n} a_i x_i = c$.


2.1.2. Corollary. Let $a_1, \dots, a_n \in \mathbb{Z}$. The Diophantine equation


    $\sum_{i=1}^{n} a_i x_i = c$

has a solution if and only if $\gcd(a_1, \dots, a_n) \mid c$.


Proof. If $a_1 = \dots = a_n = 0$, then there is a solution to $\sum_{i=1}^{n} a_i x_i = c$ if and
only if $c = 0$. We see $c = 0$ if and only if $0 \mid c$, and since $\gcd(0, \dots, 0) = 0$,
the corollary holds in this case.

Hence for the remainder of the proof, we may assume some $a_i \ne 0$, in
which case $d = \gcd(a_1, \dots, a_n)$ is the greatest common divisor of the set
$\{a_1, \dots, a_n\}$.

The case $n = 1$ is trivial, and the $n = 2$ case is a consequence of Theorem
[2.1.1]. Proceeding by induction, we suppose that the result holds for $n - 1$
variables (and for $n = 2$). Consider the equation


    $\sum_{i=1}^{n} a_i x_i = c$.


Since $d$ divides the left-hand side of this equation, the condition $d \mid c$ is
necessary.

Suppose that $d \mid c$. Let $b = \gcd(a_1, \dots, a_{n-1})$, and note that $\gcd(b, a_n) = d$.
By the $n = 2$ case, the equation $by + a_n x_n = c$ has a solution, say $y = Y$
and $x_n = X_n$. Now using the case of $n - 1$ variables, since $b \mid bY$, solve the
equation


    $\sum_{i=1}^{n-1} a_i x_i = bY$.

Call this solution $x_i = X_i$ for $i = 1, \dots, n - 1$. It is clear that $X_1, \dots, X_n$ is
a solution to our original equation. □
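
The induction in this proof translates directly into a recursive procedure. The sketch below is illustrative only (the names `bezout` and `solve` are assumptions, not notation from the text): it peels off the last coefficient, solves the two-variable equation $by + a_n x_n = c$, and then recurses on $\sum_{i<n} a_i x_i = bY$.

```python
from functools import reduce
from math import gcd


def bezout(a, b):
    """Return (s, t) with a*s + b*t == gcd(a, b), the gcd taken non-negative."""
    s0, t0, s1, t1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return (s0, t0) if a >= 0 else (-s0, -t0)


def solve(coeffs, c):
    """Return one integer solution of sum(a_i * x_i) == c, or None."""
    d = reduce(gcd, coeffs, 0)
    if d == 0:                      # all coefficients are zero
        return [0] * len(coeffs) if c == 0 else None
    if c % d != 0:                  # the necessary condition d | c fails
        return None
    if len(coeffs) == 1:
        return [c // coeffs[0]]
    *head, last = coeffs
    b = reduce(gcd, head, 0)
    s, t = bezout(b, last)          # b*s + last*t == gcd(b, last) == d
    scale = c // d
    big_y, x_n = s * scale, t * scale   # b*big_y + last*x_n == c
    return solve(head, b * big_y) + [x_n]


coeffs = [6, 10, 15]
solution = solve(coeffs, 1)
print(solution, sum(a * x for a, x in zip(coeffs, solution)))
```

The solution found this way is just one of infinitely many; the recursion makes no attempt to find a small one.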


2.1.3. Example. Consider the problem of measuring exactly 3 cups of
water using two containers, one which holds 12 cups and one which holds 17
cups, but neither has any markings for smaller units. This is really a matter
of solving the equation $12x + 17y = 3$. From the Euclidean algorithm, we
get $5(17) - 7(12) = 1$. (See the table.)


     n |  q |  s |  t
    ---+----+----+----
    17 |    |  1 |  0
    12 |    |  0 |  1
     5 |  1 |  1 | -1
     2 |  2 | -2 |  3
     1 |  2 |  5 | -7

Hence, 3 = 15(17) — 21(12) = 3(17) — 4(12). To implement this solution, 
fill the 17 cup container. Fill the 12 cup container from the 17 cupper. 
Dump out the 12 cup container and add the remaining 5 cups. Refill the 17 
cup container, and continue filling and emptying the 12 cup container. It 
takes another 7 cups to fill it. Empty the 12 cup container again, and add 
the remaining 10 cups. Fill the 17 cupper a third time. Two more cups fill
the 12 cupper, leaving 15 cups in the 17 cup container. Pour out another 12
cups, leaving the 17 cup container holding exactly 3 cups. In other words, 
we have filled the 17 cup container 3 times, and emptied out 4 lots of 12 
cups using the 12 cup container. This leaves 3(17) — 4(12) = 3 cups. 
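
The pouring procedure can be checked by simulation. The following sketch is an illustration only; the function `measure` and its rule (fill the large container when it is empty, empty the small one when it is full, otherwise pour from large to small) are one assumed way of encoding the steps described above.

```python
def measure(big, small, target, max_steps=1000):
    """Simulate the pouring procedure until `big` holds exactly `target` cups.

    Returns (fills of big, dumps of small, cups left in small).
    """
    in_big = in_small = 0
    fills = dumps = 0
    for _ in range(max_steps):
        if in_big == target:
            return fills, dumps, in_small
        if in_big == 0:
            in_big, fills = big, fills + 1          # refill the large container
        elif in_small == small:
            in_small, dumps = 0, dumps + 1          # empty the small container
        else:
            poured = min(in_big, small - in_small)  # pour large into small
            in_big -= poured
            in_small += poured
    raise ValueError("target not reached within max_steps")


fills, dumps, left_in_small = measure(17, 12, 3)
removed = dumps * 12 + left_in_small
print(fills, removed, fills * 17 - removed)
```

Here `removed` counts all the water that has passed into the 12 cup container, including the last 12 cups still sitting in it, so the final figure is the amount left in the 17 cup container.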


Exercises 


1. Solve $615x + 243y = 21$.
2. Solve $2491x + 1113y = 212$.


3. Using a 16 cL measure and a 27 cL measure and (approximately) half a
   litre of milk in a jug, how can you measure out exactly 30 cL? What is
   the most efficient way?


4. Find a solution of $30w + 42x + 70y + 105z = 1$.


5. An experimental robot may move forward in small steps of 27cm and
   in large steps of 75cm. It cannot turn or move backwards. It is at the
   beginning of a track of length exactly 10m. How does the robot get as
   close as possible to the other end of the track?


6. A revised version of the robot above is able to move backwards as well
   as forwards the same distances. How much better can it do on a short
   track of length 1m than the earlier model robot?


2.2. Congruences


A rather useful notion in number theory is that of modular arithmetic,
which means working only with the remainders after division by some fixed
integer. For example, working modulo 2, a number is either even or odd.
To determine the parity of the sum of two numbers, one need only know
the parity of the two numbers, not their actual values. Similarly, their
product will be even if either number is even, and odd only if both are odd.
Assign the number 0 to all even numbers (as this is the remainder after
dividing by 2), and assign the number 1 to all odd numbers. The 'addition'
and 'multiplication' tables for these remainders are the ones given elsewhere
in the text for the ring $\mathbb{Z}_2$.


Another familiar situation is clock arithmetic. If the time now is 7 
o’clock, then in 19 hours it will be 2 o’clock. This calculation amounts to 
adding 19 to 7, and then throwing away all multiples of 12 until the result 
lies in the range of 1 to 12. 

We will see that a similar situation holds for every positive integer $n$.
We say that $a$ is congruent to $b$ modulo $n$ provided that $n$ divides $a - b$,
and write


    $a \equiv b \pmod n$.


For example,


    $752 \equiv 968352 \pmod{100}$
    $-98743 \equiv 57 \pmod{16}$

but

    $99998 \not\equiv 22 \pmod 3$.


For every integer $a$, the Division algorithm shows that there is exactly
one number $b$ in $\{0, 1, \dots, n-1\}$ so that $a \equiv b \pmod n$. For each remainder
$r$, an integer $a$ can be chosen so that $a \equiv r \pmod n$; such an $a$ is called a
representative of $r$. The important property to recognize is that addition and
multiplication of remainders do not depend on which representative is used.
More precisely:


2.2.1. Proposition. Let $n$ be a positive integer. Suppose that $a_1 \equiv a_2 \pmod n$
and $b_1 \equiv b_2 \pmod n$. Then,


    $a_1 + b_1 \equiv a_2 + b_2 \pmod n$,


and


    $a_1 b_1 \equiv a_2 b_2 \pmod n$.


Proof. The hypotheses say that $n$ divides both $a_1 - a_2$ and $b_1 - b_2$. Adding
shows that $n$ divides


    $(a_1 - a_2) + (b_1 - b_2) = (a_1 + b_1) - (a_2 + b_2)$,


which is to say, $a_1 + b_1 \equiv a_2 + b_2 \pmod n$.
For multiplication, consider the calculation


    $a_1 b_1 - a_2 b_2 = (a_1 - a_2) b_1 + a_2 (b_1 - b_2)$.

Since $a_1 - a_2$ and $b_1 - b_2$ are multiples of $n$, this shows that $a_1 b_1 - a_2 b_2$ is a
multiple of $n$. In other words, $a_1 b_1 \equiv a_2 b_2 \pmod n$. □


For example, consider the problem of determining the last 2 digits of
$311^{1748}$. Since $311 \equiv 11 \pmod{100}$, it suffices to consider powers of 11.
These powers are computed modulo 100 as $11, 21, 31, 41, \dots$. It is not
necessary to compute $11^3$, for example, because


    $11^3 = 11^2 \cdot 11 \equiv 21 \cdot 11 = 231 \equiv 31 \pmod{100}$.

In particular, $11^{10} \equiv 1 \pmod{100}$. Thus,

    $311^{1748} \equiv (11^{10})^{174} \cdot 11^8 \equiv 81 \pmod{100}$.


Later on, we will derive computational tools that will make this exercise
even easier.
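
The calculation above can be confirmed by machine, reading the example as the power $311^{1748}$. The sketch below is not the method of the text: it uses repeated squaring, a standard technique, and the name `power_mod` is an illustrative choice. It relies only on Proposition 2.2.1, which permits reducing modulo $n$ after every multiplication.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring.

    Assumes exponent >= 0 and modulus >= 1. Every intermediate product
    is reduced, so the numbers never grow beyond modulus**2.
    """
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                 # current binary digit is 1
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1
    return result


print([power_mod(11, k, 100) for k in range(1, 11)])
print(power_mod(311, 1748, 100))
```

Python's built-in three-argument `pow(311, 1748, 100)` computes the same value.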


Exercises 


1. Compute the remainder modulo 7 of 2222°°°°. 


2. What are the possible squares modulo 4? Hence show that 1234567 is 
not the sum of two squares. 


3. Suppose that $a_1 \equiv a_2 \pmod n$ and $b_1 \equiv b_2 \pmod n$. Show that
   $a_1 - b_1 \equiv a_2 - b_2 \pmod n$.

4. Suppose that $a \equiv b \pmod n$. If $p(x)$ is a polynomial with integer
   coefficients, show that


       $p(a) \equiv p(b) \pmod n$.

   Hint: First prove this for the monomials $x^k$.


5. Let $n = \sum_{i=0}^{k} a_i 10^i$ where the $a_i$ are integers in $\{0, 1, \dots, 9\}$,
   i.e., when written in base-10 expansion, $n$ has digits $a_k, \dots, a_0$.
   (a) Prove that $3 \mid n$ if and only if $3 \mid \sum_{i=0}^{k} a_i$.
   (b) Prove that $9 \mid n$ if and only if $9 \mid \sum_{i=0}^{k} a_i$.
   (c) Prove that $11 \mid n$ if and only if $11 \mid \sum_{i=0}^{k} (-1)^i a_i$.
   (d) Give a criterion in terms of the digits $a_i$ for when 7 divides $n$.
       Hint: $7 \mid 1001$.


6. (Josephus Problem) Let $n$ be a positive integer and write the numbers
   from 1 through $n$ in a circle. Starting at 1, continue going around
   the circle removing every other number until only one number remains.
   Determine the values of $n$ for which 1 is the last remaining number. For
   example, if $n = 7$, we start by crossing off 2, then 4, then 6, then 1, then
   5, then 3, so the last remaining number is 7.


2.3. The Ring $\mathbb{Z}_n$


Proposition [2.2.1] allows us to define a ring called $\mathbb{Z}_n$. The elements of the
ring are $[0], [1], \dots, [n-1]$ corresponding to the remainders $\{0, \dots, n-1\}$.
Addition is defined by setting $[a] + [b]$ to be the remainder $[c]$ such that
$a + b \equiv c \pmod n$. Similarly, multiplication is defined by setting $[a][b]$ to be
the remainder $[c]$ such that $ab \equiv c \pmod n$.


2.3.1. Example. Here are addition and multiplication tables for $\mathbb{Z}_4$ and
$\mathbb{Z}_5$:


    Z_4:    + | 0 1 2 3          · | 0 1 2 3
           ---+---------        ---+---------
            0 | 0 1 2 3          0 | 0 0 0 0
            1 | 1 2 3 0          1 | 0 1 2 3
            2 | 2 3 0 1          2 | 0 2 0 2
            3 | 3 0 1 2          3 | 0 3 2 1

    Z_5:    + | 0 1 2 3 4        · | 0 1 2 3 4
           ---+-----------      ---+-----------
            0 | 0 1 2 3 4        0 | 0 0 0 0 0
            1 | 1 2 3 4 0        1 | 0 1 2 3 4
            2 | 2 3 4 0 1        2 | 0 2 4 1 3
            3 | 3 4 0 1 2        3 | 0 3 1 4 2
            4 | 4 0 1 2 3        4 | 0 4 3 2 1
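
Tables like these are easy to generate for any modulus. The following short sketch is offered as an illustration; the helper names `tables` and `show` are assumptions, not notation from the text. It represents each class $[a]$ by its remainder in $\{0, \dots, n-1\}$.

```python
def tables(n):
    """Return the addition and multiplication tables of Z_n as lists of rows."""
    add = [[(a + b) % n for b in range(n)] for a in range(n)]
    mul = [[(a * b) % n for b in range(n)] for a in range(n)]
    return add, mul


def show(table, symbol):
    """Print a table with a header row and a label on each row."""
    n = len(table)
    print(symbol, "|", *range(n))
    for a, row in enumerate(table):
        print(a, "|", *row)


addition, multiplication = tables(5)
show(addition, "+")
print()
show(multiplication, "*")
```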


Alternatively, we can use all the integers to represent elements $[k]$ of $\mathbb{Z}_n$
with the rule that $[j] = [k]$ if and only if $j - k \equiv 0 \pmod n$. Then the rules
for addition and multiplication become


    $[j] + [k] = [j + k]$,    $[j][k] = [jk]$.


This appears easier, but it raises a new difficulty. Before, there was only one
definition of addition and multiplication for each pair $\{[a], [b]\}$. Now there
are many such definitions, one for each pair of integers which represent the
same two elements. It is important that all these definitions agree. For
example, consider $[2] + [3] = [5]$ in $\mathbb{Z}_7$. Instead, one might have chosen
representatives $[16]$ instead of $[2]$ and $[-18]$ instead of $[3]$. For their sum,
we get $[16] + [-18] = [-2]$. Since $[-2] = [5]$ in $\mathbb{Z}_7$, these two definitions are
the same. Proposition [2.2.1] shows that we get the same result regardless of
which representative is chosen.

Using these tables, we can painstakingly verify all the laws of a commutative
ring. However, a bit of thought shows that $\mathbb{Z}_n$ inherits all these
properties from the integers. For example, consider the associative law for
addition. For any elements $[a], [b], [c]$ in $\mathbb{Z}_n$,


    $[a] + ([b] + [c]) = [a] + [b + c] = [a + (b + c)]$
    $= [(a + b) + c] = [a + b] + [c] = ([a] + [b]) + [c]$.


Proposition [2.2.1] shows that it did not matter which choice of representatives
was made. So the formula is verified. Similarly, all the properties of a
commutative ring can be verified. So we obtain:


2.3.2. Proposition. $\mathbb{Z}_n$ is a commutative ring.


If you study the multiplication table for $\mathbb{Z}_5$ above, you will see that
every non-zero element has an inverse; for example, $[2] \cdot [3] = [1]$. (That is,
$2(3) = 6 \equiv 1 \pmod 5$.) This is a property which $\mathbb{Z}_5$ has but the integers do
not. A commutative ring in which every non-zero element has an inverse is
called a field. These fields will play a very important role in algebra. Two
well known fields are the rational numbers $\mathbb{Q}$ and the real numbers $\mathbb{R}$.

In Definition [1.8.1] we defined integral domains and zero divisors. Fields
are examples of integral domains, but $\mathbb{Z}$ is an example of an integral domain
which is not a field. The ring $\mathbb{Z}_6$ provides an example of a ring which is
not an integral domain since $[2]$ and $[3]$ are zero divisors; this is because
$[2] \cdot [3] = [6] = [0]$ but $[2] \ne [0] \ne [3]$.

In order to determine when $\mathbb{Z}_n$ is a field, we need the following simple
consequence of the Euclidean algorithm.


2.3.3. Lemma. Suppose that $a$, $b$ and $n$ are integers with $\gcd(a, n) = 1$.
Then the equation

    $ax \equiv b \pmod n$

has exactly one integer solution modulo $n$. In other words, $[a][x] = [b]$ has
exactly one solution in $\mathbb{Z}_n$.


Proof. Define a function $f : \mathbb{Z}_n \to \mathbb{Z}_n$ by $f([x]) = [ax]$ for all $[x]$ in $\mathbb{Z}_n$.
First, let us verify that $f$ is one-to-one. Suppose that $[x]$ and $[y]$ are elements
of $\mathbb{Z}_n$. Pick representatives $x$ and $y$ in $\mathbb{Z}$ for $[x]$ and $[y]$. If $f([x]) = f([y])$, we
can interpret this as saying $ax \equiv ay \pmod n$. This is equivalent to saying
that $n$ divides $a(x - y)$. By Lemma [1.6.1], $n$ divides $x - y$. This of course
means that $x \equiv y \pmod n$. So, $[x] = [y]$.

The set $\mathbb{Z}_n$ has exactly $n$ elements. The function $f$ is one-to-one, and so
takes each of these $n$ elements to $n$ distinct elements of $\mathbb{Z}_n$. It follows that $f$
is onto. Thus there is exactly one element $[x_0]$ such that $[b] = f([x_0]) = [a x_0]$.
In other words, $x_0$ is the unique solution mod $n$ of the congruence equation
$ax \equiv b \pmod n$. □


2.3.4. Corollary. For integers $a$ and $n$, there is an integer $b$ so that
$ab \equiv 1 \pmod n$ if and only if $\gcd(a, n) = 1$.


Proof. The 'if' direction is immediate from the lemma. On the other hand,
if $\gcd(a, n) = d > 1$, then $ab + kn$ is a multiple of $d$ for every choice of $b$ and
$k$; and so can never equal 1. □
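
The lemma and its corollary suggest a direct way to solve $ax \equiv b \pmod n$ on a computer. The sketch below is an illustration under stated assumptions: the name `solve_congruence` is hypothetical, $n$ is taken to be a positive integer, and the inverse is obtained from Python's built-in `pow(a, -1, n)`, which is available from Python 3.8 onward.

```python
from math import gcd


def solve_congruence(a, b, n):
    """Return the unique x in {0, ..., n-1} with a*x ≡ b (mod n).

    Requires gcd(a, n) == 1; otherwise [a] has no inverse in Z_n.
    """
    if gcd(a, n) != 1:
        raise ValueError("a is not invertible modulo n")
    inverse = pow(a, -1, n)      # the integer with a*inverse ≡ 1 (mod n)
    return (inverse * b) % n


x = solve_congruence(3, 4, 7)
print(x, (3 * x) % 7)

# Brute force over Z_7 confirms that the solution is unique.
print([y for y in range(7) if (3 * y) % 7 == 4])
```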


The invertible elements of a ring are called units. The set $\mathbb{Z}_n^*$ of all units
of $\mathbb{Z}_n$ is called the group of units of $\mathbb{Z}_n$. $\mathbb{Z}_n^*$ is closed under multiplication
(i.e. if $[a]$ and $[b]$ are units, then $[ab]$ is a unit). It has an identity $[1]$, every
element has an inverse, and multiplication is commutative and associative.
An algebraic object with these properties is called an abelian group. (The
word abelian is derived from the name Abel, who was an eminent algebraist.
It means commutative.)

This corollary shows that $[a]$ is an invertible element, or unit of $\mathbb{Z}_n$,
exactly when $\gcd(a, n) = 1$. We record this as a separate result.


2.3.5. Corollary. The units of $\mathbb{Z}_n$ are $\mathbb{Z}_n^* = \{[a] : \gcd(a, n) = 1\}$.


Now we can show that $\mathbb{Z}_n$ is a field if and only if $n$ is a prime.


2.3.6. Theorem. If $p$ is a prime, then $\mathbb{Z}_p$ is a field. On the other hand,
if $n$ is composite, $\mathbb{Z}_n$ has zero divisors and hence is not an integral domain.


Proof. Suppose $p$ is prime. Then every non-zero element of $\mathbb{Z}_p$ has an
inverse by Corollary [2.3.5]. Hence $\mathbb{Z}_p$ is a field.

Conversely, if $n$ is composite, factor $n = ab$ so that neither $a$ nor $b$ is $\pm n$.
Then $n$ does not divide either $a$ or $b$. So they represent non-zero elements
$[a]$ and $[b]$ in $\mathbb{Z}_n$ satisfying $[a][b] = [0]$. Therefore $\mathbb{Z}_n$ has zero divisors. □
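
The dichotomy in this theorem is easy to observe for small moduli. The following sketch is illustrative only (the function names `units` and `zero_divisors` are assumptions): it lists the units of $\mathbb{Z}_n$ using the gcd criterion and finds the zero divisors by brute force, for $n \ge 2$.

```python
from math import gcd


def units(n):
    """Remainders a in {1, ..., n-1} with gcd(a, n) == 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def zero_divisors(n):
    """Non-zero remainders a with a*b ≡ 0 (mod n) for some non-zero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]


for n in (5, 6, 7, 12):
    print(n, units(n), zero_divisors(n))
```

For a prime modulus every non-zero remainder appears among the units and the list of zero divisors is empty; for a composite modulus the two lists together account for all the non-zero remainders.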


The final result of this section gives a bound on the number of roots of
a polynomial in $\mathbb{Z}_p$. We prove this after a preliminary lemma.


2.3.7. Lemma. Let $a \in \mathbb{Z}$ and

    $p(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$

with $a_0, \dots, a_n \in \mathbb{Z}$. Then there is a polynomial $q(x)$ and $r \in \mathbb{Z}$ so that
$p(x) = (x - a) q(x) + r$. Moreover, $r = p(a)$.


Proof. We prove the existence of $q(x)$ and $r$ by induction on $n$. If $n = 0$,
we may take $q = 0$ and $r = a_0$. For $n > 0$, we achieve the result by "long
division". We have $(x - a) a_n x^{n-1} = a_n x^n - a a_n x^{n-1}$ is a multiple of $x - a$.
Subtracting this from $p(x)$ leaves


    $p_1(x) = (a_{n-1} + a a_n) x^{n-1} + a_{n-2} x^{n-2} + \dots + a_1 x + a_0$.

By our inductive hypothesis, we have a polynomial $q_1(x)$ in $\mathbb{Z}[x]$ and $r \in \mathbb{Z}$
such that $p_1(x) = (x - a) q_1(x) + r$. Then

    $p(x) = p_1(x) + (x - a) a_n x^{n-1} = (x - a)(q_1(x) + a_n x^{n-1}) + r$,


so we may take $q(x) = q_1(x) + a_n x^{n-1}$.

Since $p(x) = (x - a) q(x) + r$, substituting $x = a$ yields $r = p(a)$. □
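
The long division in this proof is the procedure often called synthetic division. The sketch below is an illustration, not taken from the text; the function name `divide_by_linear` and the convention of listing coefficients from $a_n$ down to $a_0$ are assumptions. Each step absorbs the leading term exactly as in the passage from $p(x)$ to $p_1(x)$.

```python
def divide_by_linear(coeffs, a):
    """Divide p(x) by (x - a).

    coeffs lists the integer coefficients [a_n, ..., a_1, a_0].
    Returns (quotient coefficients, remainder), where the remainder is p(a).
    """
    running = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c     # new leading coefficient after one division step
        running.append(carry)
    return running[:-1], running[-1]


# p(x) = x^3 - 2x + 5 divided by (x - 2)
quotient, remainder = divide_by_linear([1, 0, -2, 5], 2)
print(quotient, remainder)
```

In this example the quotient is $x^2 + 2x + 2$ and the remainder is $9 = p(2)$.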


A polynomial $p(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$ is monic if $a_n = 1$.


2.3.8. Corollary. If $q(x)$ is a monic polynomial of degree $d$ with integer
coefficients, and $p$ is a prime, then the congruence equation


    $q(x) \equiv 0 \pmod p$


has at most $d$ solutions modulo $p$.


Proof. This will follow by induction on the degree $d$. For $d = 1$, this
follows from Lemma [2.3.3]. Assume that the result holds for all polynomials
of degree less than $d$. If $q(x) \equiv 0 \pmod p$ has no solutions, the theorem
holds trivially. So assume that $a$ is a solution. By Lemma [2.3.7] we have


    $q(x) = (x - a) q_1(x) + q(a) \equiv (x - a) q_1(x) \pmod p$.

If $b \not\equiv a \pmod p$ is any other solution, then

    $0 \equiv q(b) \equiv (b - a) q_1(b) \pmod p$.


Since $b - a \not\equiv 0 \pmod p$ and $\mathbb{Z}_p$ has no zero divisors, it follows that $q_1(b) \equiv 0
\pmod p$. In other words, all roots of $q$ other than $a$ are roots of $q_1$. By
the induction hypothesis, $q_1(x) \equiv 0 \pmod p$ has at most $d - 1$ solutions.
Therefore $q(x) \equiv 0 \pmod p$ has at most $d$ solutions. □
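
For small primes the bound can be checked by testing every remainder. The following brute-force sketch is illustrative; the name `roots_mod` and the coefficient ordering (highest degree first) are assumptions. It evaluates the polynomial by Horner's rule, reducing modulo $m$ at each step.

```python
def roots_mod(coeffs, m):
    """Return all x in {0, ..., m-1} with p(x) ≡ 0 (mod m).

    coeffs lists the integer coefficients from highest degree down to a_0.
    """
    def value(x):
        result = 0
        for c in coeffs:
            result = (result * x + c) % m
        return result

    return [x for x in range(m) if value(x) == 0]


print(roots_mod([1, 0, -1, 0], 5))   # x^3 - x modulo 5
print(roots_mod([1, 0, 1], 7))       # x^2 + 1 modulo 7
```

The cubic has exactly three roots modulo 5, which meets the bound, while $x^2 + 1$ has none modulo 7.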


Exercises 


1. Write down the addition and multiplication tables for $\mathbb{Z}_6$.
2. Solve the equation $x^2 + 4x + 2 \equiv 0 \pmod 7$ by completing the square.


3. Solve the equation $x^2 + x + 7 \equiv 0 \pmod{13}$ by completing the square.
   In this case, it helps to add a linear polynomial which is congruent to 0
   modulo 13.


4. Show by example that Corollary [2.3.8] is false if $p$ is not prime.
5. Show by example that Corollary [2.3.8] is false if $q$ is not monic.¹


6.* Show that every finite integral domain is a field.
    Hint: modify the proof of Lemma [2.3.3].


¹ We thank Anton Mosunov for suggesting this exercise.


2.4. Equivalence Relations


In this section, we will discuss an important mathematical notion which was
used implicitly in the last two sections. This topic could be skipped by those
keen to get on with the number theory. However, it is a notion that will
recur frequently in your mathematical studies.


2.4.1. Definition. An equivalence relation on a set $S$ is a relation $\sim$
satisfying the three properties:


(1) reflexivity: $a \sim a$ for all $a \in S$.
(2) symmetry: $a \sim b$ implies $b \sim a$ for all $a, b \in S$.
(3) transitivity: $a \sim b$ and $b \sim c$ imply $a \sim c$ for all $a, b, c \in S$.


2.4.2. Example. Let $S$ be any set, and consider the equality relation.
That is, $a$ is related to $b$ if and only if $a = b$. This is easily seen to be an
equivalence relation.


2.4.3. Example. Consider the relation on $\mathbb{Z}$ given by congruence modulo
$n$. It is clear that the reflexivity property $a \equiv a \pmod n$ holds since $n \mid 0$.
Also, if $a \equiv b \pmod n$, then $n \mid b - a$. Thus, $n \mid a - b$ and so $b \equiv a \pmod n$.
This verifies symmetry. Finally, if $a \equiv b \pmod n$ and $b \equiv c \pmod n$, then
$n \mid b - a$ and $n \mid c - b$, so $n \mid (c - b) + (b - a) = c - a$. Thus, $a \equiv c \pmod n$. So
the relation is also transitive. This is an equivalence relation.


2.4.4. Example. Consider the relation $\le$ on $\mathbb{R}$. Since $a \le a$, we see $\le$
is reflexive. If $a \le b$ and $a \ne b$, then $b \not\le a$. So $\le$ is not symmetric. It is
transitive, since $a \le b$ and $b \le c$ implies $a \le c$. This is not an equivalence
relation.


2.4.5. Example. Consider a relation on $\mathbb{Z}$ given by $n \sim m$ if $n$ and $m$
have the same sign, meaning $+$, $-$, or $0$. Now, $n$ has the same sign as itself.
If $n$ and $m$ have the same sign, then $m$ and $n$ have the same sign. Finally, if
$n$ and $m$ have the same sign, and $m$ and $k$ have the same sign, then $n$ and
$k$ have the same sign. So, this is an equivalence relation.


If $\sim$ is an equivalence relation on a set $S$, then each element $a$ of $S$
belongs to the equivalence class $[a] = \{b \in S \mid b \sim a\}$. Every element
of $S$ belongs to exactly one equivalence class. So $S$ is partitioned into a
disjoint union of these equivalence classes. Conversely, if $S$ is partitioned
into a disjoint union of sets $E_\alpha$ for $\alpha \in A$, then define a relation $a \sim b$ if
and only if $a$ and $b$ belong to the same set $E_\alpha$. One can check that this is
an equivalence relation. In fact, this is essentially what occurs in the example
of signs above. One denotes the set of equivalence classes by

    $\{[a] : a \in S\} = S/{\sim}$.
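
On a finite set, the partition into equivalence classes can be computed directly. The sketch below is an illustration only; the function name `equivalence_classes` is an assumption, and the code presumes that the supplied relation really is an equivalence relation, so that comparing with a single representative of each class is enough.

```python
def equivalence_classes(elements, related):
    """Partition `elements` into classes of the equivalence relation `related`.

    `related(x, y)` must be reflexive, symmetric and transitive.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):    # compare with one representative
                cls.append(x)
                break
        else:
            classes.append([x])       # x starts a new class
    return classes


def sign(n):
    return (n > 0) - (n < 0)


print(equivalence_classes(range(-3, 6), lambda a, b: (a - b) % 3 == 0))
print(equivalence_classes(range(-2, 3), lambda n, m: sign(n) == sign(m)))
```

The first call groups integers by congruence modulo 3 and the second by sign. If `related` is not an equivalence relation, for instance $\le$, the output depends on the order of the elements and is not a meaningful partition.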


Equivalence relations arise naturally in many mathematical situations.
Often, as is the case for modular arithmetic, one wants to define some
algebraic operation on the equivalence classes which is compatible with the
corresponding operation on the original set. Consider congruence modulo $n$
again. The equivalence class for an integer $a$ is $[a] = \{a + kn \mid k \in \mathbb{Z}\}$. When
addition is defined on these equivalence classes by


    $[a] + [b] = [a + b]$,


it is important that we can choose any representative from each class and
add them in order to determine the class of the sum. This is known as
showing that the definition of addition is well defined. This is the content
of Proposition [2.2.1]. In other words,


    $\{a + jn \mid j \in \mathbb{Z}\} + \{b + kn \mid k \in \mathbb{Z}\} = \{a + b + tn \mid t \in \mathbb{Z}\}$.

This same proposition shows that multiplication is well defined. In set terms,

    $\{a + jn \mid j \in \mathbb{Z}\} \cdot \{b + kn \mid k \in \mathbb{Z}\} \subset \{ab + tn \mid t \in \mathbb{Z}\}$.

For contrast, consider defining addition in Example [2.4.5]. Let us call the
three equivalence classes $[+]$, $[-]$ and $[0]$. When we try to define $[a] + [b] =
[a + b]$, the sign of $a + b$ is ambiguous. For if $a = 1$ and $b = -2$, the sum
is negative, which suggests that $[+] + [-] = [-]$. But if $a = 2$ and $b = -1$,
then $a + b > 0$, which suggests $[+] + [-] = [+]$. Likewise, if $a = 3$ and
$b = -3$, then $a + b = 0$, which suggests that $[+] + [-]$ should be $[0]$. So it
is not possible to define an addition on these equivalence classes which is
compatible with addition on the integers. Such a definition only works for
certain equivalence relations. For this reason, when one defines an operation
on equivalence classes, it is very important to check that the definition is
well defined.


Exercises 


1. Which of the following relations are equivalence relations? If not, deter- 
mine which of the three properties do hold. 
(a) For all z,y € R, say « + y if x — y is rational. 
(b) For all a,b € Z, say a = b if ged (a,b) = 1. 
(c) For all continuous, positive functions f,g on R, say f = g if 


lim f(2)/9(z) = 1. 


(d) For all a,b € Z, say a b if 3\(a +). 
(e) For all a,b EN, say a = 5 if alb. 
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Say that two continuous functions on [0,1] are equivalent (f ~ g) pro- 
vided that f(0) = g(0) and f(1) = g(1). Show that addition is well 
defined on the equivalence classes. 


Put a relation on N by setting n = m if n/ gcd(n,m) and m/ ged(n, m) 

are both odd. 

(a) Show that this is an equivalence relation, and describe the equiva- 
lence classes. 

(b) Show that the multiplication [n][m] = [nm] is well defined. 

(c) Show that the addition [n] + [m] = [n + m] is not well defined. 


4. (Construction of the rational numbers) Put a relation on
$S = \mathbb{Z} \times (\mathbb{Z} \setminus \{0\})$ given by $(a, b) \sim (c, d)$ if $ad = bc$.
(a) Show that $\sim$ is an equivalence relation and let $Q = S/{\sim}$.
(b) Show that multiplication $[(a, b)][(c, d)] = [(ac, bd)]$ is well defined.
(c) Show that addition $[(a, b)] + [(c, d)] = [(ad + bc, bd)]$ is well defined.
(d) Prove that $Q$ is a field with the above addition and multiplication
operations.
(e) Prove that the map
$$\varphi : Q \to \mathbb{Q}, \qquad \varphi([(a, b)]) = a/b$$
is an isomorphism.


5. (Construction of fraction fields) Let $R$ be any integral domain and
put a relation on $S = R \times (R \setminus \{0\})$ given by $(a, b) \sim (c, d)$ if $ad = bc$.
(a) Show that $\sim$ is an equivalence relation and let $\operatorname{Frac}(R) = S/{\sim}$.
(b) Show that multiplication $[(a, b)][(c, d)] = [(ac, bd)]$ is well defined.
(c) Show that addition $[(a, b)] + [(c, d)] = [(ad + bc, bd)]$ is well defined.
(d) Prove that $\operatorname{Frac}(R)$ is a field with the above addition and
multiplication operations. This is referred to as the fraction field (or
quotient field) of $R$.


2.5. Chinese Remainder Theorem 


In this section, we will study systems of linear congruences of a very special
form. Problems of this type were studied in many ancient civilizations. A
full solution was obtained first in China by Yih-hing in 717. It is thought
to have been used as a method of representing numbers, and doing large
computations.


To illustrate the method, consider the following example. 


2.5.1. Example. Consider the system
$$\begin{aligned} x &\equiv 3 \pmod{4} \\ x &\equiv 12 \pmod{25} \\ x &\equiv 1 \pmod{3}. \end{aligned}$$
First, let us solve the first pair of equations. This requires integers $x$, $y$
and $z$ such that
$$x = 3 + 4y = 12 + 25z.$$
Hence, $4y - 25z = 9$. By inspection, $y = -4$ and $z = -1$ is a solution. Since
$\gcd(4, 25) = 1$, the most general solution is
$$y = -4 + 25m, \qquad z = -1 + 4m.$$
Hence $x = 3 + 4(-4 + 25m) = -13 + 100m$. Now combine this with the
third equation $x = 1 + 3n$. This yields
$$100m - 3n = 14.$$
Since $100(1) - 3(33) = 1$, there is a solution $m = 14$ and $n = 14(33) = 462$;
hence, $m = 14 - 4(3) = 2$ and $n = 462 - 4(100) = 62$ is a solution. The
most general solution is given by
$$m = 2 + 3k, \qquad n = 62 + 100k,$$
which gives $x = 3(62 + 100k) + 1 = 187 + 300k$. In other words, $x \equiv 187 \pmod{300}$.
Notice that $300 = (4)(25)(3)$.


Now we consider the problem in general. 


2.5.2. Lemma. Suppose that $m$ and $n$ are relatively prime positive
integers. Then the system of congruences
$$\begin{aligned} x &\equiv a \pmod{m} \\ x &\equiv b \pmod{n} \end{aligned}$$
has a unique solution (mod $mn$).

Proof. An integer $x$ is a solution if and only if there are integers $y$ and $z$
satisfying
$$x = a + my = b + nz.$$
Therefore, $y$ and $z$ form a solution of
$$my - nz = b - a.$$
By Theorem 2.1.1, this has a solution $y_0, z_0$, and the most general solution
is
$$y = y_0 + nk, \qquad z = z_0 + mk.$$
Substituting back in yields
$$x = a + my_0 + mnk = b + nz_0 + mnk.$$
It is readily apparent that such an $x$ solves our system of equations, so we
have found a complete solution. From the form of this solution, $x$ is unique
modulo $mn$. ∎


Now we can prove the Chinese Remainder Theorem. 
2.5.3 Chinese Remainder Theorem. Suppose that $m_1, \dots, m_n$ are
pairwise relatively prime positive integers (i.e. $\gcd(m_i, m_j) = 1$ for $i \neq j$).
Then the system of congruence equations
$$\begin{aligned} x &\equiv a_1 \pmod{m_1} \\ x &\equiv a_2 \pmod{m_2} \\ &\;\;\vdots \\ x &\equiv a_n \pmod{m_n} \end{aligned}$$
has a unique solution modulo $m_1 m_2 \cdots m_n$.

Proof. The proof is an induction argument. The lemma did the $n = 2$
case. Suppose that the result holds for all $k < n$, where $n \geq 3$. Consider the
first $n - 1$ equations. By the induction hypothesis, this system has a unique
solution $b$ modulo $m_1 \cdots m_{n-1}$. In other words, the solution of this system
is the same as the solution of the equation
$$x \equiv b \pmod{m_1 \cdots m_{n-1}}.$$
So our original system has the same solutions as the system
$$\begin{aligned} x &\equiv b \pmod{m_1 \cdots m_{n-1}} \\ x &\equiv a_n \pmod{m_n}. \end{aligned}$$
By the lemma, this has a unique solution (mod $m_1 \cdots m_n$). ∎
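The induction in this proof is effectively an algorithm: combine two congruences at a time using the lemma. The following is a minimal Python sketch of that idea, not code from the text. The function names `extended_gcd`, `crt_pair` and `crt` are our own, and the inverse of $m$ modulo $n$ is obtained from the extended Euclidean algorithm.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g."""
    if b == 0:
        return a, 1, 0
    g, s, t = extended_gcd(b, a % b)
    return g, t, s - (a // b) * t


def crt_pair(a, m, b, n):
    """Solve x = a (mod m), x = b (mod n) for coprime m, n.

    Returns (x, m*n) with 0 <= x < m*n, as in Lemma 2.5.2.
    """
    g, s, _ = extended_gcd(m, n)
    if g != 1:
        raise ValueError("moduli must be relatively prime")
    # Write x = a + m*y. Then m*y = b - a (mod n), and s inverts m modulo n.
    y = ((b - a) * s) % n
    return (a + m * y) % (m * n), m * n


def crt(residues, moduli):
    """Fold crt_pair over the whole system, mirroring the induction."""
    x, m = 0, 1
    for a, n in zip(residues, moduli):
        x, m = crt_pair(x, m, a, n)
    return x, m


# The system of Example 2.5.1.
print(crt([3, 12, 1], [4, 25, 3]))
```

Applied to Example 2.5.1, the sketch returns the residue 187 together with the modulus 300, in agreement with the hand computation.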
Exercises

1. Show that if $m_1, \dots, m_n$ are not relatively prime, then the conclusion of
the Chinese Remainder Theorem never holds.

2. Solve the system of equations
$$\begin{aligned} x &\equiv 2 \pmod{7} \\ x &\equiv 5 \pmod{11} \\ x &\equiv 9 \pmod{13}. \end{aligned}$$

3. Solve the system of equations
$$\begin{aligned} x &\equiv 9 \pmod{27} \\ x &\equiv 4 \pmod{5} \\ x &\equiv 7 \pmod{16}. \end{aligned}$$

4. Solve the equation $x^2 - x - 1 \equiv 0 \pmod{385}$.

5. For every positive integer $n$, find $n$ consecutive integers none of which
are square-free.²

² This exercise was given on the 1955 Putnam competition.
2.6. Congruence Equations


Solving equations with congruences often yields useful information about the
solution in the integers. It is also of independent interest to solve equations
in $\mathbb{Z}_n$. Lemma 2.3.3 is an example of this kind of result. We will start by
giving a more general form of it.


2.6.1. Theorem. The congruence equation
$$ax \equiv b \pmod{n}$$
has a solution if and only if $d = \gcd(a, n)$ divides $b$. The solution is unique
(mod $n/d$).


Proof. Notice that $ax \equiv b \pmod{n}$ if and only if there is an integer $y$ such
that $ax + ny = b$. By the theorem on linear Diophantine equations, this has
a solution if and only if $\gcd(a, n) \mid b$. In this case, let $A = a/d$, $B = b/d$
and $N = n/d$. Dividing the Diophantine equation by $d$ reduces the problem to
solving $Ax + Ny = B$. This is equivalent to solving $Ax \equiv B \pmod{N}$. Since
$\gcd(A, N) = 1$, Lemma 2.3.3 shows that the solution is unique (mod $N$). ∎
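The proof translates directly into a procedure: test whether $d$ divides $b$, divide through by $d$, and invert $A$ modulo $N$. Here is a short Python sketch under that reading. The name `solve_linear_congruence` is our own, and the three-argument `pow` with exponent $-1$ (available from Python 3.8) is assumed for the modular inverse.

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Solve a*x = b (mod n).

    Returns (x0, N) meaning x = x0 (mod N) with N = n // gcd(a, n),
    or None when gcd(a, n) does not divide b.
    """
    d = gcd(a, n)
    if b % d != 0:
        return None
    A, B, N = a // d, b // d, n // d
    if N == 1:
        return 0, 1  # every integer is a solution
    # gcd(A, N) = 1, so A has an inverse modulo N.
    x0 = (B * pow(A, -1, N)) % N
    return x0, N


print(solve_linear_congruence(4, 2, 6))   # solvable: gcd(4, 6) = 2 divides 2
print(solve_linear_congruence(4, 3, 6))   # not solvable: 2 does not divide 3
```

Note that the answer is a residue modulo $N = n/d$, so it corresponds to $d$ distinct residues modulo $n$.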


2.6.2. Example. Here is an example of a linear congruence equation with
two variables:
$$34x + 4y \equiv 3 \pmod{47}.$$
It might appear that the left-hand side is even and the right-hand side is
odd. But in fact the right-hand side is really $3 + 47k$, which may be even if
$k$ is odd. Since $\gcd(4, 47) = 1$, one can write 1 as a combination of 4 and
47. For example, $1 = 4(12) - 47$. So, $4(12) \equiv 1 \pmod{47}$. If we multiply
the original equation by 12, we obtain
$$12(34)x + 12(4)y \equiv 12(3) \pmod{47}.$$
Since $12(34) \equiv 12(-13) = -156 \equiv -156 + 3(47) = -15 \pmod{47}$, this can be
rewritten as
$$y \equiv 36 + 15x \pmod{47}.$$
Thus there are 47 solutions (mod 47), one for each choice of $x$.


2.6.3. Example. Now consider an equation of higher degree
$$x^2 + 1 \equiv 0 \pmod{65}.$$
With a little luck, you might notice that $x = 8$ is a solution. Following
standard factorization techniques, you will be led to
$$(x - 8)(x + 8) = x^2 - 64 \equiv x^2 + 1 \equiv 0 \pmod{65}.$$
If this were an exact equation over the integers or even the real numbers,
you could conclude that $x = \pm 8$ were the only solutions. However, in solving
this (mod 65), we are actually working in $\mathbb{Z}_{65}$. By Theorem 2.3.6, $\mathbb{Z}_{65}$ is
not an integral domain. The fact that it has zero divisors means that just
because the product of $x - 8$ and $x + 8$ is 0 does not mean that either of
these terms need be zero.

To deal with this problem, we use the Chinese Remainder Theorem but
in reverse. The point is that the equation $x^2 + 1 \equiv 0 \pmod{65}$ has the same
solutions as the system
$$\begin{aligned} x^2 + 1 &\equiv 0 \pmod{5} \\ x^2 + 1 &\equiv 0 \pmod{13}. \end{aligned}$$
The advantage of this is that $\mathbb{Z}_5$ and $\mathbb{Z}_{13}$ are both fields. So
$$\begin{aligned} x^2 + 1 \equiv (x - 8)(x + 8) &\equiv 0 \pmod{5} \\ x^2 + 1 \equiv (x - 8)(x + 8) &\equiv 0 \pmod{13} \end{aligned}$$
do have exactly the obvious solutions. This is because in a field (or even
in an integral domain) the product of two numbers is 0 only if one of the
factors is 0. Thus we obtain the system
$$\begin{aligned} x &\equiv \pm 8 \pmod{5} \\ x &\equiv \pm 8 \pmod{13}. \end{aligned}$$
This is really four sets of equations
$$\begin{array}{ll} x \equiv 8 \pmod{5} & \qquad x \equiv -8 \pmod{5} \\ x \equiv 8 \pmod{13} & \qquad x \equiv -8 \pmod{13} \\[1ex] x \equiv 8 \pmod{5} & \qquad x \equiv -8 \pmod{5} \\ x \equiv -8 \pmod{13} & \qquad x \equiv 8 \pmod{13} \end{array}$$
Each of these sets of equations has a unique solution (mod 65) due to the
Chinese Remainder Theorem again. The first two sets have the solutions
$x \equiv \pm 8 \pmod{65}$ that we are already aware of. The last two sets have the
solutions $x \equiv \pm 18 \pmod{65}$. So two surprising solutions turned up.
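The "Chinese Remainder Theorem in reverse" strategy is easy to check by machine for a modulus this small. The sketch below is an illustration of our own, with hypothetical helper names `roots_mod` and `combine`: it finds the roots modulo 5 and modulo 13 by brute force, glues every pair together, and compares with a direct search modulo 65.

```python
from itertools import product


def roots_mod(poly, n):
    """All residues x in range(n) with poly(x) = 0 (mod n), by brute force."""
    return [x for x in range(n) if poly(x) % n == 0]


def combine(a, m, b, n):
    """The unique x in range(m*n) with x = a (mod m) and x = b (mod n).

    Assumes gcd(m, n) = 1; a linear search is enough for small moduli.
    """
    return next(x for x in range(m * n) if x % m == a and x % n == b)


def f(x):
    return x * x + 1


r5, r13 = roots_mod(f, 5), roots_mod(f, 13)
roots65 = sorted(combine(a, 5, b, 13) for a, b in product(r5, r13))
print(r5, r13, roots65)
```

The four combined residues are 8, 18, 47 and 57, that is, $\pm 8$ and $\pm 18$ modulo 65.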


2.6.4. Example. Let us look at the problem of determining how many
square roots of 1 there are modulo $n$. Working as above, we can factor $n$
into a product of prime powers and solve a system of easier equations. Let
us first solve the equation
$$x^2 - 1 \equiv 0 \pmod{p^d}$$
where $p$ is prime. Now, $x^2 - 1$ factors as $(x - 1)(x + 1)$ so that $x = \pm 1$
are roots. Can there be any other roots? If there are, then $x - 1$ and
$x + 1$ must both be divisible by some positive power of $p$. Hence $p$ divides
$\gcd(x - 1, x + 1)$, and thus divides $(x + 1) - (x - 1) = 2$. So when $p$ is any odd
prime, $x^2 - 1 \equiv 0 \pmod{p^d}$ has exactly two solutions, $x \equiv \pm 1 \pmod{p^d}$.

We must consider $p = 2$ separately. Following our argument above, we
see that it may be possible that $2^a \mid x - 1$ and $2^b \mid x + 1$. The $\gcd(x - 1, x + 1)$
is at least $2^{\min\{a,b\}}$ and divides 2. Thus $\min\{a, b\} \leq 1$. The new solutions
occur when $\min\{a, b\} = 1$, namely $a = 1$, $b = d - 1$ or $a = d - 1$, $b = 1$. This
yields solutions
$$x \equiv 2^{d-1} \pm 1 \pmod{2^d}.$$
Hence $x^2 \equiv 1 \pmod{2^d}$ if and only if $x \equiv \pm 1 \pmod{2^{d-1}}$. Thus there are
4 solutions modulo $2^d$ if $d \geq 3$. By inspection, there is 1 solution modulo 2
and 2 solutions modulo 4.

To describe the number of solutions of $x^2 \equiv 1 \pmod{n}$, let us write the
factorization of $n$ as
$$n = 2^{d_0} p_1^{d_1} \cdots p_k^{d_k}$$
where $p_i$ are distinct odd primes and $d_i > 0$ for $i \geq 1$, but $d_0 = 0$ is allowed.
Let $e = \max\{d_0 - 1, 0\}$. The problem reduces to solving the system
$$\begin{aligned} x &\equiv \pm 1 \pmod{2^e} \\ x &\equiv \pm 1 \pmod{p_1^{d_1}} \\ &\;\;\vdots \\ x &\equiv \pm 1 \pmod{p_k^{d_k}}. \end{aligned}$$
For each $i \geq 1$ there are two choices modulo $p_i^{d_i}$, and for $i = 0$, there are
$s_0 = 1$, 2 or 4 choices modulo $2^{d_0}$ depending on whether $d_0 - 2$ is negative,
0 or positive. Altogether this yields $s = 2^k s_0$ different systems of equations.
By the Chinese Remainder Theorem, each system has a unique solution
modulo $n$. So there are $s$ square roots of 1 modulo $n$.
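The count $s = 2^k s_0$ can be compared with a brute-force count for small $n$. The following Python sketch is our own check, not part of the text. `count_sqrt_one` counts directly, while `predicted_count` evaluates the formula, with $k$ the number of distinct odd primes dividing $n$ and $d_0$ the exponent of 2.

```python
def count_sqrt_one(n):
    """Number of residues x modulo n with x*x = 1 (mod n), by brute force."""
    return sum(1 for x in range(n) if (x * x - 1) % n == 0)


def predicted_count(n):
    """The count 2**k * s0 from Example 2.6.4."""
    d0 = 0
    while n % 2 == 0:
        n //= 2
        d0 += 1
    k, p = 0, 3
    while n > 1:
        if n % p == 0:        # p is prime here: smaller primes were removed
            k += 1
            while n % p == 0:
                n //= p
        p += 2
    # s0 = 1, 2 or 4 according to whether d0 - 2 is negative, zero or positive.
    s0 = 1 if d0 < 2 else (2 if d0 == 2 else 4)
    return 2 ** k * s0


for n in (8, 15, 65, 120):
    print(n, count_sqrt_one(n), predicted_count(n))
```

For instance, $120 = 2^3 \cdot 3 \cdot 5$ gives $k = 2$ and $s_0 = 4$, so the formula predicts 16 square roots of 1.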


Unlike the case of real numbers, where it is not hard to solve degree 2
equations, solving quadratic equations in $\mathbb{Z}_n$ is a subject with considerable
depth. Indeed, if $p$ and $q$ are odd primes, there is a surprising relationship
between whether $x^2 \equiv p \pmod{q}$ is solvable and whether $x^2 \equiv q \pmod{p}$
is solvable. Known as Quadratic Reciprocity, this is a cornerstone result in
Elementary Number Theory; see Chapter 3.

It is worth pointing out that our example of solving $x^2 + 1 \equiv 0 \pmod{65}$
illustrates another interesting phenomenon. We see
$$(x - 8)(x + 8) \equiv x^2 + 1 \equiv (x - 18)(x + 18) \pmod{65}.$$
We have therefore obtained two different factorizations of $x^2 + 1$ into “primes”
(i.e. irreducible polynomials). This shows the failure of unique factorization
for polynomials with coefficients in $\mathbb{Z}_{65}$.


Exercises

1. Find all solutions of $1713x \equiv 871 \pmod{2000}$.

2. Solve $64x \equiv 84 \pmod{66}$ completely.

3. Solve completely the equation $3x + 7y \equiv 11 \pmod{95}$.

4. Solve $x^2 \equiv 8x \pmod{437}$.

5. What are the cube roots of unity mod 91? In other words, solve the
equation $x^3 - 1 \equiv 0 \pmod{91}$.

6. Solve $x^3 + x^2 + x + 1 \equiv 0 \pmod{91}$.


7. Solve the congruence system 


(mod 82) 
(mod 82). 


2x2 + Sy 
7x + 13y 


on 


1 


2.7. Fermat’s Little Theorem 


The theorem to be proven in this section does not deserve the title ‘little’. 
Indeed, it is a very important fact. However, Fermat’s most famous non- 
theorem has so overshadowed all his other work that this lovely result is 
‘belittled’. 


2.7.1 Fermat’s Little Theorem. Let $p$ be a prime, and let $a$ be an
integer which is not a multiple of $p$. Then
$$a^{p-1} \equiv 1 \pmod{p}.$$
Thus, $n^p \equiv n \pmod{p}$ for every integer $n$.


Proof. Consider the function $f$ mapping $\mathbb{Z}_p$ into itself used in the proof
of Lemma 2.3.3:
$$f([x]) = [ax].$$
Since $\gcd(a, p) = 1$, this function is one-to-one and onto. We have $f([0]) = [0]$.
So $f$ gives a bijection of the non-zero elements of $\mathbb{Z}_p$. In other words,
$\{[a], [2a], \dots, [(p-1)a]\}$ is just the set $\{[1], [2], \dots, [p-1]\}$ possibly in some
other order. Hence
$$a(2a)(3a) \cdots ((p-1)a) \equiv 1(2)(3) \cdots (p-1) \pmod{p}.$$
Simplifying both sides, we obtain
$$(p-1)!\, a^{p-1} \equiv (p-1)! \pmod{p}.$$
The element $[(p-1)!]$ is not zero (i.e. $p$ does not divide $(p-1)!$), and since
$\mathbb{Z}_p$ is a field, we can cancel out the $(p-1)!$ on each side of the equation.
(Alternatively, the cancellation is justified because $\gcd((p-1)!, p) = 1$.) Thus,
$$a^{p-1} \equiv 1 \pmod{p}.$$ ∎
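Both ingredients of this proof can be observed numerically. The sketch below is an illustration of our own, with function names that are assumptions rather than anything from the text: it checks that multiplication by $a$ permutes the non-zero residues modulo $p$, and that $a^{p-1} \equiv 1$ and $n^p \equiv n \pmod{p}$ hold.

```python
def multiplication_permutes(a, p):
    """True when x -> a*x (mod p) maps {1, ..., p-1} onto itself."""
    nonzero = set(range(1, p))
    return {(a * x) % p for x in nonzero} == nonzero


def fermat_holds(a, p):
    """True when a**(p-1) = 1 (mod p); pow does fast modular exponentiation."""
    return pow(a, p - 1, p) == 1


p = 13
print(all(multiplication_permutes(a, p) for a in range(1, p)))
print(all(fermat_holds(a, p) for a in range(1, p)))
print(all(pow(n, p, p) == n % p for n in range(-20, 20)))
```

For a composite modulus the permutation property can fail. For example, multiplication by 2 modulo 6 sends both 1 and 4 to residues in $\{2\}$, so the map is not one-to-one there.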
This can be reformulated as a result about $\mathbb{Z}_p$.


2.7.2. Corollary. Let $p$ be a prime. If $[a]$ is a non-zero element of $\mathbb{Z}_p$,
then $[a]^{p-1} = [1]$. For all elements $[n]$, one has $[n]^p = [n]$.

2.7.3. Corollary. Let $p$ be a prime. If $[a]$ is a non-zero element of $\mathbb{Z}_p$,
then
$$[a]^{-1} = [a]^{p-2}.$$

This theorem has many uses. 


2.7.4. Example. One immediate use is in simplifying congruence equa- 
tions. Very high powers can be replaced by lower ones. Consider the equa- 
tion 


7°09 4 2947°43 — 1947482 4 1990901 4 820182 — 75a)! 4 342°3 60 =0 (mod 61). 


It is immediately clear that $x \equiv 0 \pmod{61}$ is not a solution. For every
other $x$, we have $x^{60} \equiv 1 \pmod{61}$. So the equation reduces to
$$1 + 29x^3 - 19x^2 + 199x + 82x^2 - 75x + 34x^3 - 60 \equiv 0 \pmod{61}.$$
This reduces to
$$2x^3 + 2x^2 + 2x + 2 \equiv 0 \pmod{61}.$$
After cancelling the 2 and pulling out the factor $x + 1$, this becomes
$$(x + 1)(x^2 + 1) \equiv 0 \pmod{61}.$$
Trial and error finds the solutions $x = \pm 11$ of $x^2 + 1 \equiv 0 \pmod{61}$. This means the cubic factors as
$$x^3 + x^2 + x + 1 \equiv (x + 1)(x - 11)(x + 11) \pmod{61}.$$
Since 61 is a prime, this is zero only if one of the three factors is zero. So
the complete solution is $x \equiv 11$, 50 or 60 (mod 61).
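The reduction step rests on replacing an exponent $e$ by $e \bmod 60$ when $x \not\equiv 0 \pmod{61}$. A brief Python sketch of our own can confirm both this rule and the final answer. It works with the reduced cubic only, and the exponent 482 is used purely as a sample high power.

```python
p = 61


def cubic(x):
    """The reduced polynomial 2x^3 + 2x^2 + 2x + 2 from the example."""
    return 2 * x**3 + 2 * x**2 + 2 * x + 2


# Exponents may be reduced modulo p - 1 = 60 for every non-zero residue.
sample_exponent = 482
reduction_ok = all(
    pow(x, sample_exponent, p) == pow(x, sample_exponent % (p - 1), p)
    for x in range(1, p)
)

# Since p is small, the roots can simply be found by exhaustive search.
roots = [x for x in range(p) if cubic(x) % p == 0]
print(reduction_ok, roots)
```

The search returns exactly the residues 11, 50 and 60, matching the factorization $(x + 1)(x - 11)(x + 11)$.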


The number $(p-1)!$ comes up in the proof of Fermat’s Little Theorem.
It is an interesting fact that $(p-1)! \pmod{p}$ can be computed.


2.7.5 Wilson’s Theorem. If $p$ is a prime, $(p-1)! \equiv -1 \pmod{p}$.


Proof. The result is trivial for $p = 2$. So without loss of generality, $p$ is an
odd prime. The idea is to evaluate the product $[1][2] \cdots [p-1]$ by pairing off
each element $[a]$ with its inverse $[a]^{-1}$. There is a slight problem because $[a]$
might be its own inverse. This happens only if $[a]$ is a root of $x^2 = [1]$, which
factors as $(x - [1])(x + [1]) = [0]$. Since $\mathbb{Z}_p$ is a field, the only solutions are
$$x = [1] \quad \text{and} \quad x = [-1].$$
Hence the non-zero elements pair off into $(p-3)/2$ pairs of inverses
$\{[a], [a]^{-1}\}$ and two singletons $[1]$ and $[-1]$. Multiplying together all the
non-zero elements of $\mathbb{Z}_p$ results in a product of $(p-3)/2$ ones and $[1][-1] = [-1]$.
That is, $(p-1)! \equiv -1 \pmod{p}$. ∎
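The pairing argument can be made concrete. In the Python sketch below, which is our own illustration, `inverse_pairs` lists the pairs $\{a, a^{-1}\}$ with $a \neq a^{-1}$ and `wilson_residue` computes $(n-1)! \bmod n$. The helper names are hypothetical, and `pow(a, -1, p)` assumes Python 3.8 or later.

```python
from math import factorial, isqrt


def is_prime(n):
    """Trial division; adequate for the small values used here."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def wilson_residue(n):
    """(n-1)! reduced modulo n."""
    return factorial(n - 1) % n


def inverse_pairs(p):
    """Pairs (a, b) with a < b and a*b = 1 (mod p), for a prime p."""
    pairs = set()
    for a in range(1, p):
        inv = pow(a, -1, p)
        if a != inv:
            pairs.add((min(a, inv), max(a, inv)))
    return sorted(pairs)


print(inverse_pairs(11))      # (11 - 3) / 2 = 4 pairs are expected
print(wilson_residue(11))     # expected to equal 11 - 1
```

Running the residue over a range of $n$ also suggests that the congruence holds only for primes, although the theorem above asserts just one direction.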
Exercises


13 
1. Compute 2!”” (mod 13). 
2. Find all solutions of 
352° 4 994790 4. 514220 — 472717 4+ 232148 + 390147 


+ 244 4 3407? — 23074 +1202 +16=0 (mod 73). 


Solve 229 + x75 + ¢144+1=0 (mod 91). 


4. Suppose that $p$ is a prime of the form $p = 4n + 1$. Prove that $\pm(2n)!$ are
roots of the equation $x^2 + 1 \equiv 0 \pmod{p}$.


5. Let $a > 1$ be any positive integer, and let $p$ and $q$ be primes. Show that
if $q$ divides $a^p - 1$, then $q \equiv 1 \pmod{p}$.


6. Use the previous exercise to test whether 2!3 — 1 and 2°’ — 1 are prime. 
This cuts down significantly on the number of prime divisors that need 
to be tested. 


7. Suppose that $n$ is the product of $k$ distinct primes $p_1, \dots, p_k$. Show that
$$\sum_{i=1}^{k} \left(\frac{n}{p_i}\right)^{p_i - 1} \equiv 1 \pmod{n}.$$


8* The Fermat numbers have the form $F_n = 2^{2^n} + 1$. The first few, $F_0 = 3$,
$F_1 = 5$, $F_2 = 17$, $F_3 = 257$, and $F_4 = 65537$ are prime. However,
$F_5 = 641(6700417)$, and $p = 6700417$ is prime. Let
$$a = 2935363331541925531.$$
You may assume (correctly) that
$$a \equiv 1 \pmod{F_0 F_1 F_2 F_3 F_4\, p} \quad \text{and} \quad a \equiv -1 \pmod{641}.$$
Show that $2^k a + 1$ is never prime for $k \geq 1$.

9* Define a function $f$ on $\{(n, m) : n, m \in \mathbb{N},\ n \geq 2\}$ as follows:
$$k = k(n, m) := (n-1)! + 1 - mn,$$
$$f(n, m) := \frac{n-2}{2}\left(|k^2 - 1| - (k^2 - 1)\right) + 2.$$
Compute the range of $f$.


2.8. Euler’s Theorem 


In this section, we generalize Fermat’s Little Theorem from primes to
arbitrary integers. The problem is to figure out what the right generalization
is. In order for $a^d \equiv 1 \pmod{n}$, it is necessary that $ax \equiv 1 \pmod{n}$ have
a solution. By Theorem 2.6.1, this means that $\gcd(a, n) = 1$. It turns out
that this is also sufficient for some power of $a$ to be congruent to 1 modulo
$n$. In terms of the ring $\mathbb{Z}_n$, this is just the condition that $[a]$ has an inverse
because $a(a^{d-1}) = 1$.


2.8.1. Definition. The Euler totient or phi function is the cardinality
$\varphi(n)$ of $\mathbb{Z}_n^*$. That is, $\varphi(n)$ is the cardinality of
$$\{a : 1 \leq a \leq n,\ \gcd(a, n) = 1\}.$$
For example, $\varphi(12) = |\{1, 5, 7, 11\}| = 4$.


2.8.2. Example. If $p$ is prime, it is clear that $\varphi(p) = p - 1$. More
generally, if $n = p^d$, then $\gcd(a, n) \neq 1$ if and only if $p \mid a$. The multiples of
$p$ between 1 and $n$ are given by $p, 2p, 3p, \dots, p^d$, i.e. $p \cdot 1, p \cdot 2, \dots, p \cdot (p^{d-1})$.
We see there are $p^{d-1}$ such numbers, so $\varphi(p^d) = p^d - p^{d-1} = p^{d-1}(p - 1)$.
We will obtain a formula for an arbitrary $\varphi(n)$ in the next section.


You should notice that the proof of the following theorem is exactly the 
same as the proof of Fermat’s Little Theorem. 


2.8.3 Euler’s Theorem. If $\gcd(a, n) = 1$, then $a^{\varphi(n)} \equiv 1 \pmod{n}$.


Proof. Fix an integer $a$ such that $\gcd(a, n) = 1$. Consider the function
on $\mathbb{Z}_n$ given by $f([x]) = [ax]$. By Lemma 2.3.3, $f$ is one-to-one and onto.
As we have noted, if $[a]$ and $[x]$ are units, then so is $[ax]$. So, $f$ maps $\mathbb{Z}_n^*$
onto itself. Multiplying all the units together yields the equation
$$\prod_{[x] \in \mathbb{Z}_n^*} [x] = \prod_{[x] \in \mathbb{Z}_n^*} [ax] = [a]^{\varphi(n)} \prod_{[x] \in \mathbb{Z}_n^*} [x].$$
Since $\prod_{[x] \in \mathbb{Z}_n^*} [x]$ is a unit, it can be cancelled off leaving
$$[a]^{\varphi(n)} = [1].$$ ∎
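A direct computation makes the statement easy to test. The sketch below is our own and uses the definition of the phi function literally (counting residues coprime to $n$), which is slow but transparent. The names `phi` and `euler_holds` are assumptions for illustration.

```python
from math import gcd


def phi(n):
    """Euler's phi function, straight from Definition 2.8.1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def euler_holds(a, n):
    """True when a**phi(n) = 1 (mod n); intended for n >= 2."""
    return pow(a, phi(n), n) == 1


n = 12
units = [a for a in range(1, n) if gcd(a, n) == 1]
print(units, phi(n))
print(all(euler_holds(a, n) for a in units))
```

When $\gcd(a, n) > 1$ no power of $a$ is congruent to 1 modulo $n$, so the hypothesis cannot be dropped.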


Exercises


1. If $\gcd(a, 561) = 1$, show that $a^{80} \equiv 1 \pmod{561}$. Calculate $\varphi(561)$.

2. Let $n = p_1 p_2 p_3$ be the product of three distinct primes. Let
$$d = \operatorname{lcm}\{p_1 - 1,\ p_2 - 1,\ p_3 - 1\}.$$
Prove that if $\gcd(a, n) = 1$, then $a^d \equiv 1 \pmod{n}$. Generalize.

3. (a) Suppose that $n$ is the product of $k$ distinct primes. Use the Chinese
remainder theorem to show that if $m \equiv 1 \pmod{\varphi(n)}$, then $a^m \equiv a \pmod{n}$
for all integers $a$.
(b) Show by example that this is false for $n = 49$.


4. Before reading the next section, compute a few examples such as $\varphi(30)$,
$\varphi(72)$, $\varphi(225)$ in order to conjecture a formula for $\varphi(n)$.


5* Compute $\prod_{[x] \in \mathbb{Z}_n^*} [x]$.
HINT: Use the information about square roots of 1 in $\mathbb{Z}_n$ to show that
if $n$ is odd with $k$ distinct prime factors, then $\prod_{[x] \in \mathbb{Z}_n^*} [x] = [-1]^{2^{k-1}}$. Then
find the general formula.


2.9. More on Euler’s Phi Function 


First we obtain a formula for $\varphi(n)$. The key tool is the Chinese Remainder
Theorem.


2.9.1. Lemma. If $\gcd(n, m) = 1$, then $\varphi(nm) = \varphi(n)\varphi(m)$.


Proof. It is clear that $\gcd(x, nm) = 1$ if and only if $\gcd(x, n) = 1$ and
$\gcd(x, m) = 1$. Let
$$S_n = \{a : 1 \leq a \leq n,\ \gcd(a, n) = 1\} \quad \text{and} \quad S_m = \{b : 1 \leq b \leq m,\ \gcd(b, m) = 1\}.$$
For each $a \in S_n$ and $b \in S_m$, consider the system
$$\begin{aligned} x &\equiv a \pmod{n} \\ x &\equiv b \pmod{m}. \end{aligned}$$
By the Chinese Remainder Theorem, this has a unique solution (mod $nm$).
Thus for each choice of $a \in S_n$ and $b \in S_m$, we obtain one element in
$S_{nm}$. Conversely, if $x \in S_{nm}$, then $a \equiv x \pmod{n}$ belongs to $S_n$ and $b \equiv x \pmod{m}$
belongs to $S_m$. Thus,
$$\varphi(nm) = |S_{nm}| = |S_n|\,|S_m| = \varphi(n)\varphi(m).$$ ∎


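2.9.2. Theorem. If $n = p_1^{d_1} \cdots p_k^{d_k}$ where $p_i$ are distinct primes, then
$$\varphi(n) = n\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right).$$


Proof. We prove the result by induction on $k$. When $k = 1$, the number
$n$ is of the form $n = p^d$ where $p$ is prime. Then Example 2.8.2 shows
$$\varphi(n) = p^{d-1}(p - 1) = n\left(1 - \frac{1}{p}\right).$$
For $k = 2$, the Lemma applies directly to give
$$\begin{aligned} \varphi(p^a q^b) &= \varphi(p^a)\,\varphi(q^b) \\ &= p^a\left(1 - \frac{1}{p}\right) q^b\left(1 - \frac{1}{q}\right) \\ &= n\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right). \end{aligned}$$
Suppose that the result is true for $j < k$, and consider $n = p_1^{d_1} \cdots p_k^{d_k}$. Let
$m = p_1^{d_1} \cdots p_{k-1}^{d_{k-1}}$. By hypothesis, $\varphi(m) = m\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_{k-1}}\right)$. Since
$n = m p_k^{d_k}$ and $\gcd(m, p_k^{d_k}) = 1$, the lemma applies to show that
$$\begin{aligned} \varphi(n) &= \varphi(m)\,\varphi(p_k^{d_k}) \\ &= m\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_{k-1}}\right) p_k^{d_k}\left(1 - \frac{1}{p_k}\right) \\ &= n\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right). \end{aligned}$$
Therefore the theorem follows by induction. ∎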
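The product formula gives a much faster way to compute the phi function than counting. As a sanity check, the Python sketch below (our own, with hypothetical names `prime_factors`, `phi_formula` and `phi_count`) computes the value both ways. It multiplies by $(p-1)/p$ in integer arithmetic to avoid floating point error.

```python
from math import gcd


def prime_factors(n):
    """Distinct primes dividing n, found by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def phi_formula(n):
    """phi(n) = n * product of (1 - 1/p) over the primes p dividing n."""
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)   # exact: p divides result here
    return result


def phi_count(n):
    """phi(n) by counting residues coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


print(phi_formula(12), phi_count(12))
print(phi_formula(3 ** 4), 3 ** 3 * 2)
```

The second line compares the formula with Example 2.8.2 for the prime power $3^4$.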


The following result is a very useful property of the Euler phi function.


2.9.3. Theorem.
$$\sum_{d \mid n} \varphi(d) = n.$$


Proof. Let $S_d = \{k : 1 \leq k \leq n,\ \gcd(k, n) = d\}$ for divisors $d$ of $n$. Since
the only possibilities for $\gcd(k, n)$ are divisors of $n$, it is clear that this
provides a partition of $\{1, \dots, n\}$ into disjoint sets. Notice that if $k \in S_d$,
then $\gcd(k/d, n/d) = 1$ and $1 \leq k/d \leq n/d$. Conversely, if $\gcd(j, n/d) = 1$
and $1 \leq j \leq n/d$, then $k = jd$ belongs to $S_d$. Hence there is a bijection
between $S_d$ and the units of $\mathbb{Z}_{n/d}$. So $|S_d| = \varphi(n/d)$. Therefore
$$n = \sum_{d \mid n} |S_d| = \sum_{d \mid n} \varphi\!\left(\frac{n}{d}\right).$$
Since $n/d$ runs over all of the divisors of $n$ when $d$ does, the desired formula
follows. ∎
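The partition used in this proof can be listed explicitly. The following Python sketch, an illustration of our own, groups $1, \dots, n$ by their gcd with $n$ and checks that the class of $d$ has $\varphi(n/d)$ elements. The helper names are assumptions.

```python
from math import gcd


def phi(n):
    """Euler's phi function by direct counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def gcd_classes(n):
    """Map each divisor d of n to the list S_d = {k : gcd(k, n) = d}."""
    classes = {}
    for k in range(1, n + 1):
        classes.setdefault(gcd(k, n), []).append(k)
    return classes


n = 12
classes = gcd_classes(n)
for d in sorted(classes):
    print(d, classes[d], phi(n // d))
print(sum(phi(d) for d in classes) == n)
```

Every divisor $d$ of $n$ occurs as a key, since $\gcd(d, n) = d$.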


Exercises 


1. (a) Prove that if $p$ is prime and $n$ is divisible by $p$, then $\varphi(pn) = p\varphi(n)$.
(b) Show that in general if $m$ divides $n$, the quantities $\varphi(nm)$ and $m\varphi(n)$
need not be equal.


2. Prove that for every positive integer $k$, there are only finitely many $n$
for which $\varphi(n) = k$.


3. Find all $n$ with $\varphi(n) = 12$.
4. (a) Prove there are infinitely many positive integers n with y(n) = 4.
(b) Prove that there are also infinitely many positive integers n with 
y(n) = 3° 
5. Suppose that $\gcd(n, m) = 1$, and $d \mid nm$. Show that there is a unique
factorization $d = ab$ so that $a \mid n$ and $b \mid m$.


6* Verify the preceding theorem (the divisor sum formula) directly for $n = p^d$.
Then use Exercise 5 to prove it for products of distinct prime powers.


2.10. Primitive Roots 


In this section, we show that for every prime $p$, one may always find an
integer $a$ such that $\{1, a, a^2, \dots, a^{p-2}\}$ is a permutation of $\{1, 2, \dots, p-1\}$
mod $p$. This is often useful when one wishes to study problems that are
multiplicative in nature, rather than additive.

2.10.1. Definition. If $a$ is an element of $\mathbb{Z}_n^*$, its order is the smallest
positive integer $d = \operatorname{ord}_n(a)$ such that $a^d \equiv 1 \pmod{n}$. Furthermore, say
that $a$ is a primitive root (mod $n$) if the set of powers
$$\{a^k \pmod{n} : 1 \leq k \leq d\}$$
coincides with the set of all of $\mathbb{Z}_n^*$.


2.10.2. Proposition. If $a^b \equiv a^c \equiv 1 \pmod{n}$, then $d = \gcd(b, c)$
satisfies $a^d \equiv 1 \pmod{n}$ also. Hence $a^b \equiv 1 \pmod{n}$ if and only if $\operatorname{ord}_n(a) \mid b$.


Proof. By the Euclidean algorithm, there are integers $s$ and $t$ so that
$d = bs + ct$. Hence
$$a^d = (a^b)^s (a^c)^t \equiv 1 \pmod{n}.$$
In particular, $e = \gcd(b, \operatorname{ord}_n(a))$ satisfies $a^e \equiv 1 \pmod{n}$. Since $\operatorname{ord}_n(a)$
is the smallest such integer, and $e \mid \operatorname{ord}_n(a)$, we conclude that $e = \operatorname{ord}_n(a)$.
Hence $\operatorname{ord}_n(a) \mid b$. ∎


2.10.3. Corollary. If $\gcd(a, n) = 1$, then $\operatorname{ord}_n(a) \mid \varphi(n)$.


Proof. By Euler’s theorem, $a^{\varphi(n)} \equiv 1 \pmod{n}$. Hence by the preceding
proposition, $\operatorname{ord}_n(a) \mid \varphi(n)$. ∎


The set of invertible elements $\mathbb{Z}_n^*$ of $\mathbb{Z}_n$ consists of the (equivalence classes
of) elements relatively prime to $n$, and so has cardinality $\varphi(n)$. One sees that
the powers of $a$ belong to exactly $\operatorname{ord}_n(a)$ different classes (mod $n$). For if
$a^k \equiv a^l \pmod{n}$, with $k < l$, then $a^{l-k} \equiv 1 \pmod{n}$. Thus $\operatorname{ord}_n(a) \mid (l - k)$,
and so $l > \operatorname{ord}_n(a)$. Conversely, if $\operatorname{ord}_n(a) \mid (l - k)$, it follows that $a^k \equiv a^l \pmod{n}$.
So the distinct powers of $a$ are precisely
$$\{a^k \pmod{n} : 1 \leq k \leq \operatorname{ord}_n(a)\}.$$
In particular, $a$ is a primitive root of $\mathbb{Z}_n$ exactly when $\operatorname{ord}_n(a) = \varphi(n)$. So
we obtain:


2.10.4. Proposition. If $\gcd(a, n) = 1$ and $\operatorname{ord}_n(a) = n - 1$, then $n$ is
prime.


When $n$ is composite, there is frequently no primitive root. For example,
modulo 15, the elements $\{2, 7, 8, 13\}$ have order 4, $\{4, 11, 14\}$ have order 2,
and 1 has order 1. Since $\mathbb{Z}_{15}^*$ has 8 elements, there is no primitive root.
However, for a prime $p$, it will be shown that a primitive root always exists.
For example, modulo 17, the elements $\{3, 5, 6, 7, 10, 11, 12, 14\}$ are all
primitive roots. The proof is based on a counting argument, and properties of
the Euler phi function.
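The two numerical examples are quick to reproduce. The Python sketch below is our own illustration, with hypothetical names `order` and `primitive_roots`. It computes orders by repeated multiplication, which is fine for moduli this small but far too slow for large ones.

```python
from math import gcd


def order(a, n):
    """Smallest d >= 1 with a**d = 1 (mod n); requires gcd(a, n) = 1, n >= 2."""
    if gcd(a, n) != 1:
        raise ValueError("a must be a unit modulo n")
    d, power = 1, a % n
    while power != 1:
        power = power * a % n
        d += 1
    return d


def primitive_roots(n):
    """Units modulo n whose order equals the number of units."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    return [a for a in units if order(a, n) == len(units)]


print({a: order(a, 15) for a in range(1, 15) if gcd(a, 15) == 1})
print(primitive_roots(15))
print(primitive_roots(17))
```

The first line lists the order of each unit modulo 15. The empty list on the second line reflects the absence of an element of order 8.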


2.10.5. Lemma. Let $p$ be a prime. For each divisor $d$ of $p - 1$, let $f(d)$
denote the number of elements of $\mathbb{Z}_p^*$ of order $d$. Then
$$\sum_{e \mid d} f(e) = d$$
for every divisor $d$ of $p - 1$.


Proof. By Fermat’s Theorem, every element $a \in \mathbb{Z}_p^*$ satisfies $a^{p-1} = 1$. In
other words, the congruence equation $x^{p-1} - 1 \equiv 0 \pmod{p}$ has exactly
$p - 1$ solutions modulo $p$. For each divisor $d$ of $p - 1$, one has that $\operatorname{ord}_p(a) \mid d$
if and only if $a^d \equiv 1 \pmod{p}$ (i.e. exactly when $a$ is a root of $x^d - 1 \equiv 0 \pmod{p}$).
Thus the number of roots is $\sum_{e \mid d} f(e)$. Also, one can factor
$$x^{p-1} - 1 = (x^d - 1)\,p_d(x) \pmod{p}$$
where
$$p_d(x) = 1 + x^d + x^{2d} + \cdots + x^{p-1-d} = \sum_{0 \leq k < (p-1)/d} x^{kd}.$$
By Corollary 2.3.8, $p_d(x) \equiv 0 \pmod{p}$ has at most $p - 1 - d$ distinct solutions
modulo $p$, and $x^d - 1 \equiv 0 \pmod{p}$ has at most $d$ solutions. But together,
they have exactly $p - 1$ distinct solutions. So both equations must have their
full complement of solutions. In particular, $x^d \equiv 1 \pmod{p}$ has exactly $d$
solutions modulo $p$. Therefore $\sum_{e \mid d} f(e) = d$. ∎


Notice that by Theorem 2.9.3, the Euler phi function satisfies exactly
the same set of equations as the function $f$ of the lemma. That is the key
to the following theorem.

2.10.6. Theorem. The function $f$ of the preceding lemma coincides with the
Euler phi function on the divisors of $p - 1$. In particular, the field $\mathbb{Z}_p$ for
$p$ prime always has $\varphi(p - 1)$ primitive roots. Therefore there is an element
$a \in \mathbb{Z}_p^*$ so that the set of powers $\{a^k : 1 \leq k \leq p - 1\}$ coincides with the set
$\{1, 2, \dots, p - 1\}$ modulo $p$.


Proof. We prove the result by induction on the size of the divisor $d$ of $p - 1$.
For $d = 1$, there is, of course, exactly 1 solution of $x \equiv 1 \pmod{p}$. Thus
$$f(1) = 1 = \varphi(1).$$
Suppose that $f(e) = \varphi(e)$ for all divisors $e$ of $p - 1$ which are less than $d$.
In particular, this is true for all proper divisors of $d$. Hence by the previous
lemma and Theorem 2.9.3,
$$f(d) = d - \sum_{e \mid d,\ e < d} f(e) = d - \sum_{e \mid d,\ e < d} \varphi(e) = \varphi(d).$$
Therefore the number of primitive roots is $\varphi(p - 1)$, which is non-zero. ∎
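For a small prime the theorem can be confirmed by tabulating how many elements have each order. The Python sketch below is our own check, with hypothetical helper names. It tallies the orders of $1, \dots, p-1$ and compares the tally with $\varphi(d)$ for each divisor $d$ of $p - 1$.

```python
from collections import Counter
from math import gcd


def order(a, p):
    """Smallest d >= 1 with a**d = 1 (mod p), for a not divisible by p."""
    d, power = 1, a % p
    while power != 1:
        power = power * a % p
        d += 1
    return d


def phi(n):
    """Euler's phi function by direct counting."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def order_counts(p):
    """f(d) of Lemma 2.10.5: how many non-zero residues have order d."""
    return Counter(order(a, p) for a in range(1, p))


def matches_phi(p):
    """True when f(d) = phi(d) for every divisor d of p - 1."""
    counts = order_counts(p)
    divisors = [d for d in range(1, p) if (p - 1) % d == 0]
    return all(counts[d] == phi(d) for d in divisors)


print(sorted(order_counts(17).items()))
print(matches_phi(17), matches_phi(113))
```

For $p = 17$ the tally shows 8 elements of order 16, which are the primitive roots listed earlier.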


There are many interesting unsolved questions concerning primitive
roots. For example, in 1927, Artin conjectured that if $a \in \mathbb{Z}$ is not a perfect
square and not $-1$, then there exist infinitely many primes $p$ for which $a$ is
a primitive root in $\mathbb{Z}_p$. In particular, Artin’s conjecture would imply that 2
is a primitive root in $\mathbb{Z}_p$ for infinitely many primes $p$. Currently, there is no
value of $a$ for which Artin’s conjecture is known to hold. In 1967, Hooley did
however give a conditional proof of Artin’s conjecture assuming the
generalized Riemann hypothesis. Unconditionally, Heath-Brown [17] proved in
1986 that at least one of 2, 3, or 5 must be a primitive root in $\mathbb{Z}_p$ for infinitely
many primes $p$.

Now let us return to the problem of proving that a number $p$ is
definitely prime. By the previous discussion, it is sufficient to find some $a$ with
$\operatorname{ord}_p(a) = p - 1$. However, it defeats the purpose if we must compute all
$p - 1$ powers. This is not necessary if $p - 1$ can be factored. A method for
factoring is described in the next section. It may be the case that $p - 1$
has a lot of small factors. This will make factoring it substantially easier.
The idea is this: factor $p - 1 = \prod q_i^{e_i}$, then verify that $a^{p-1} \equiv 1 \pmod{p}$
and compute $a^{(p-1)/q_i} \pmod{p}$ for each prime $q_i$. If any of these is 1, then $a$ is
not a primitive root. But if they are all different from 1, then all powers of $a$ up to
$p - 1$ are different, and $a$ is a primitive root. Moreover, this shows that
$\operatorname{ord}_p(a) = p - 1$, so $p$ is definitely prime. To see this, suppose that $a^k \equiv a^l \pmod{p}$
with $1 \leq k < l < p$. Then if $m = l - k$, $a^m \equiv 1 \pmod{p}$. We also
know that $a^{p-1} \equiv 1 \pmod{p}$. Let $d = \gcd(m, p - 1)$. By Proposition 2.10.2,
$a^d \equiv 1 \pmod{p}$. Clearly, $d$ is a proper divisor of $p - 1$, since $d \leq m < p - 1$.
Thus $d$ divides $(p - 1)/q_i$ for some $i$, and so $a^{(p-1)/q_i} \equiv 1 \pmod{p}$.
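
2.10.7. Example. Consider the example $p = 113$. Factor $p - 1 = 112 = 2^4 \cdot 7$.
By hand, compute mod 113
$$\begin{aligned} 2^7 &= 128 \equiv 15 \\ 2^{14} &\equiv 15^2 = 225 \equiv -1 \\ 2^{28} &\equiv 1 \end{aligned}$$
So 2 is not a primitive root. Try 3,
$$\begin{aligned} 3^7 &= 2187 \equiv 40 \\ 3^{14} &\equiv 40^2 = 1600 \equiv 18 \\ 3^{28} &\equiv 18^2 = 324 \equiv -15 \\ 3^{56} &\equiv (-15)^2 = 225 \equiv -1 \\ 3^{112} &\equiv 1 \end{aligned}$$
So 3 is looking good so far.
$$\begin{aligned} 3^8 &\equiv 7 \\ 3^{16} &\equiv 49 \end{aligned}$$
Thus we see that $3^{112} \equiv 1 \pmod{113}$, and $3^{56} \not\equiv 1 \pmod{113}$, and $3^{16} \not\equiv 1 \pmod{113}$.
So 3 is a primitive root, and 113 is prime.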
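The test used in this example needs only one modular power per prime divisor of $p - 1$. Here is a minimal Python sketch of it, written by us under the assumption that $p - 1$ is small enough to factor by trial division. The names `prime_divisors` and `proves_prime` are hypothetical.

```python
def prime_divisors(n):
    """Distinct primes dividing n, by trial division."""
    primes, q = [], 2
    while q * q <= n:
        if n % q == 0:
            primes.append(q)
            while n % q == 0:
                n //= q
        q += 1
    if n > 1:
        primes.append(n)
    return primes


def proves_prime(a, p):
    """True when a has order p - 1 modulo p, which certifies that p is prime.

    A False result is inconclusive about p: it only says that a is not
    a primitive root modulo p.
    """
    if pow(a, p - 1, p) != 1:
        return False
    return all(pow(a, (p - 1) // q, p) != 1 for q in prime_divisors(p - 1))


print(prime_divisors(112))
print(proves_prime(2, 113), proves_prime(3, 113))
print(pow(3, 56, 113), pow(3, 16, 113))
```

With $a = 2$ the test fails because $2^{56} \equiv 1 \pmod{113}$, while $a = 3$ passes, in agreement with the hand computation.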


Of course, this method is not interesting for such small numbers. Try 
some of the following exercises with a symbolic computation program. 


Exercises 


1. Show that 19 is a primitive root for p = 191. 
2. Show that 2 is a primitive root for p = 2549. 


3. Let $p$ be prime and let $a \in \mathbb{Z}$ be a primitive root mod $p$. Prove that $a$ is
a primitive root mod $p^2$ if and only if $a^{p-1} \not\equiv 1 \pmod{p^2}$.


4. Let $p \neq q$ be odd primes. Prove that there are no primitive roots mod
$pq$.


5. Let $p, q, r$ be pairwise distinct primes which are not necessarily odd.
Prove that there are no primitive roots mod $pqr$.


6. Find a primitive element of $\mathbb{Z}_{27943}$. Give a short list of congruences that
prove that it is a primitive root, and hence that 27943 is prime. You
can use computer software.


7. Find a primitive root for p = 1423554023 using computer software. Give 
a short list of congruences that prove that it is a primitive root, and 
hence that p is prime. 
Notes on Chapter 2


Linear Diophantine equations were discussed by the Greek mathematician
Diophantus in the 3rd century CE, though he did not have a complete
solution. The Hindu school in India studied these equations in the 6th and 
7th century CE, and Brahmegupta had a method for finding a solution. 
It was in the 16th and 17th centuries that the Europeans wrote about it. 
Euler gave a complete solution in the modern style in 1734. It was Gauss 
who introduced the modern notation of congruence modulo n. 

The abstract notion of a ring was given by Fraenkel in 1914 and extended 
by Sono in 1917. However many concrete examples such as $\mathbb{Z}_n$ were well
known much earlier. The first non-commutative example was the ring of 
quaternions due to Hamilton in 1843. Cayley considered the space of n x n 
matrices as a ring in 1855. See for more on this history. 

The Chinese remainder problem, as the name suggests, first arose in 
Chinese writings from the first century CE. The Greek and the Indian schools 
also studied this problem. A complete solution was provided by Yih-hing 
in 717 CE. The Arab school has writings on it from about 1000 CE. The 
Italians wrote about partial solutions in the late 12th century. A German 
manuscript from the 15th century produced the same solution as Yih-hing. 
The modern solution in complete generality was given by Euler, and also 
Gauss, in the mid-18th century. 

Fermat’s little theorem was stated by Fermat in 1640. Euler gave a proof 
of it in 1736, and the generalization to Euler’s theorem in 1760. 

Much information about this history can be found in the volume by
Dickson [9, Vol. II]. Kleiner is another source worth reading. Cooke
contains a lot of information on mathematics before the modern era.

See Hardy and Wright for all of this material and many extensions, 
plus many historical notes. Stark is also an excellent source for this 
material. 


Chapter 3 


Diophantine Equations and 
Quadratic Number Domains 


Diophantine equations refer to equations or systems of equations in which 
both the coefficients and the unknowns are integers. Generally, there are 
more unknowns than equations. But since we are interested in integer so- 
lutions, it is often difficult to decide if there are any solutions at all. The 
most famous Diophantine equation is Fermat’s equation
$$x^n + y^n = z^n$$
for $n \geq 3$. Fermat wrote in the pages of a book (circa 1637) that he had a
truly marvelous proof that there are no solutions, but it was too long to fit 
in the margin. However, there is no way to know for certain if he really had 
such a proof. Fermat never published anything in mathematics, nor did he 
often communicate his methods to others. It is revealing, however, that he 
wrote to others that he had a proof for the case n = 4, but never claimed 
to have a general proof in his correspondence. 

Euler solved the case n = 3 in 1770. Legendre and Dirichlet indepen- 
dently solved the case n = 5 around 1825. Sophie Germain was a self-taught 
French mathematician in the late 18th century, a time when women were not 
welcomed into academic circles. She corresponded with Lagrange, Legendre 
and Gauss under a pseudonym. She did some important work on Fermat’s 
problem which was unpublished, but was mentioned by Legendre. Some of 
her results were still being reproved by others in the 20th century. 

The early development of abstract algebra, especially rings and fields, 
was in part motivated by an attempt to solve Fermat’s problem. Several 
‘proofs’ were found to be incorrect because they falsely assumed unique 
factorization in certain number domains. Kummer was the first to provide 
a solution for infinitely many primes in 1847, based on an analysis of the 
failure of unique factorization. His proof works for regular primes, which 
includes all primes less than 100 except 37, 59 and 67. 

Exciting news reached the mathematical community in June 1993 when
Andrew Wiles announced the final dramatic step to the solution of this
350-year-old problem at a conference in Cambridge. The statement of his actual
results does not immediately look like it applies to Fermat’s question, as it
refers to some advanced notions about elliptic curves. Indeed, his results are
much more far reaching than a single equation such as Fermat’s. It turned 
out that there was a gap in part of his proof. He and Richard Taylor 
worked on the gap and eventually completed the argument. In particular, 
these results combine with known work to finally resolve the most famous 
mathematical conundrum of our time. 

In this chapter, we will look at a few special cases of Diophantine equa- 
tions, and will see a variety of techniques for solving them. We also will 
take an excursion into some other number systems to see that the theorems 
we proved in the last chapter are indeed special. The quadratic number 
domains have a nice theory which imitates, yet varies from, the integers. 
Several of these domains have applications to the number theory of the inte- 
gers themselves. We finish the chapter with a proof of Gauss’s famous Law 
of Quadratic Reciprocity, which allows one to calculate whether a number 
a is a square modulo a prime p. 


3.1. Pythagorean Triples 


In this section, we will study the well known problem of determining all of
the integer solutions of the Pythagorean equation

\[ x^2 + y^2 = z^2. \]

Of course, if $(x,y,z)$ is a solution, then $(ax,ay,az)$ is also a solution. So
it is natural to insist that $\gcd(x,y,z) = 1$. Moreover, any integer which
divides any two of $x,y,z$ divides the third as well. So, it suffices to require
$\gcd(x,y) = 1$.

We will give two characterizations of such (x,y,z). The first uses an 
algebraic approach, while the second uses a geometric method. 


Algebraic approach. The first observation is obtained by looking at
squares of even and odd numbers. All such squares are congruent to 0
and 1 modulo 4 respectively. Thus the sum of two odd squares is congruent
to 2 (mod 4), and no square has this form. Since we have ruled out the case
of $x$ and $y$ both being even by assuming that they are relatively prime, it
follows that one, say $x$, is even, and the other, $y$, is odd. Hence, $z$ is also
odd.
Now consider the equation

\[ x^2 = z^2 - y^2 = (z+y)(z-y). \]

Since $x$, $z+y$, and $z-y$ are all even, there are positive integers $a, b, c$ so
that

\[ x = 2a, \qquad z+y = 2b \qquad\text{and}\qquad z-y = 2c. \]

Our equation becomes

\[ 4a^2 = 4bc \qquad\text{or}\qquad a^2 = bc. \]


Now $\gcd(b,c)$ divides $\gcd(b+c, b-c) = \gcd(z,y) = 1$. Thus $b$ and $c$ are
relatively prime. But $bc$ is a perfect square, meaning each prime factor
occurs an even number of times. As $b$ and $c$ have no common factors, they
must both be squares. Let $u$ and $v$ be positive integers such that $b = u^2$
and $c = v^2$, and thus $a = uv$. Substituting back in yields

\[ x = 2uv, \qquad y = u^2 - v^2, \qquad z = u^2 + v^2. \]

Furthermore, $\gcd(u,v) = \sqrt{\gcd(b,c)} = 1$. Since $y$ is odd, exactly one of $u$
and $v$ is odd.

On the other hand, if $u > v$ are relatively prime, one even and one odd,
then $x = 2uv$, $y = u^2 - v^2$ and $z = u^2 + v^2$ are relatively prime, and satisfy

\[ x^2 + y^2 = 4u^2v^2 + u^4 - 2u^2v^2 + v^4 = u^4 + 2u^2v^2 + v^4 = z^2. \]


This solves the problem completely. 
For the general solution of Pythagorean triples, one must put the common
factors back in. So the most general solution is given by

\[ x = 2kuv, \qquad y = k(u^2 - v^2), \qquad z = k(u^2 + v^2) \]

for arbitrary integers $u$, $v$, and $k$. Note however that to get $\gcd(x,y) = k$,
we need to specify that exactly one of $u,v$ is even and $\gcd(u,v) = 1$.
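
The characterization above can be tried out on a computer. The following Python sketch is only an illustration: the function name `primitive_triples` and the bound `limit` are assumptions of the example, not notation from the text. It lists the primitive triples $(2uv, u^2 - v^2, u^2 + v^2)$ with $z$ below a bound, using exactly the conditions $u > v$, $\gcd(u,v) = 1$, and one of $u, v$ even.

```python
from math import gcd

def primitive_triples(limit):
    """Yield primitive triples (x, y, z) with z <= limit, where
    x = 2uv, y = u^2 - v^2, z = u^2 + v^2 (hypothetical helper)."""
    u = 2
    while u * u + 1 <= limit:
        for v in range(1, u):
            # u > v, relatively prime, exactly one of them even
            if (u - v) % 2 == 1 and gcd(u, v) == 1:
                z = u * u + v * v
                if z <= limit:
                    yield 2 * u * v, u * u - v * v, z
        u += 1

triples = sorted(primitive_triples(30), key=lambda t: t[2])
for x, y, z in triples:
    print(x, y, z)
```

Multiplying each triple by a positive integer $k$ then recovers the non-primitive solutions.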


Geometric approach. Observe that $(x, y, z)$ is a solution if and only if the
point $(ax, ay, az)$ is a rational solution for all $a \ne 0$ in $\mathbb{Q}$. If we take $a = 1/z$,
we obtain a rational solution $(x/z, y/z, 1)$. Conversely, if $(x, y, 1)$ is a rational
solution, then clearing the denominator yields integer solutions. Therefore,
it is enough to classify $(x,y) \in \mathbb{Q}^2$ with $x^2 + y^2 = 1$. We see that $Q = (0,1)$
is such a solution.

Consider the line $\ell_t$ through $(0,1)$ with slope $t$, and let $P_t$ be the second point of
intersection of $\ell_t$ with the circle $x^2 + y^2 = 1$.

[Figure: the line $\ell_t$ through $Q = (0,1)$ meeting the circle $x^2 + y^2 = 1$ again at $P_t$.]

Let's express $P_t$ in terms of $t$. The line $\ell_t$ is given by $y = tx + 1$. Substituting
this expression for $y$ into $x^2 + y^2 = 1$, we find

\[ 1 = x^2 + (tx + 1)^2 = (t^2 + 1)x^2 + 2tx + 1. \]

The two solutions to this equation are $x = 0$ and

\[ x = \frac{-2t}{t^2 + 1}. \tag{3.1.1} \]

Plugging back into $y = tx + 1$, we have

\[ y = \frac{-2t^2}{t^2 + 1} + 1 = \frac{1 - t^2}{t^2 + 1}. \tag{3.1.2} \]

Therefore,

\[ P_t = \left( \frac{-2t}{t^2 + 1}, \frac{1 - t^2}{t^2 + 1} \right). \]


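Notice that if $t \in \mathbb{Q}$, then $P_t \in \mathbb{Q}^2$. Conversely, suppose $P \in \mathbb{Q}^2$ lies on the circle and
$P \ne (0, \pm 1)$. Then the line between $P$ and $(0,1)$ is not vertical, and its slope $t$ is
rational, so $P = P_t$ with $t \in \mathbb{Q}$. The remaining point $(0,-1)$ lies on the vertical line
through $(0,1)$, which has no slope. Hence, we have shown

\[ \{(x,y) \in \mathbb{Q}^2 : x^2 + y^2 = 1\} = \{P_t : t \in \mathbb{Q}\} \cup \{(0,-1)\}
= \left\{ \left( \frac{-2t}{t^2 + 1}, \frac{1 - t^2}{t^2 + 1} \right) : t \in \mathbb{Q} \right\} \cup \{(0,-1)\}. \]

Note that setting $t = 0$ yields the point $(0, 1)$.

Now, set $t = \frac{a}{b}$ with $a, b \in \mathbb{Z}$ relatively prime and $b \ne 0$. Then

\[ P_t = \left( \frac{-2(a/b)}{(a/b)^2 + 1}, \frac{1 - (a/b)^2}{(a/b)^2 + 1} \right) = \left( \frac{-2ab}{a^2 + b^2}, \frac{b^2 - a^2}{a^2 + b^2} \right). \]

Multiplying through by $a^2 + b^2$, we see that all of the integer solutions of
$x^2 + y^2 = z^2$ (up to scalar and sign) are given by

\[ (2ab, \; a^2 - b^2, \; a^2 + b^2). \]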
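
As a small illustration of the parametrization, the following Python sketch computes the second intersection point of the line $y = tx + 1$ with the unit circle using exact rational arithmetic, and recovers the slope from the point. The function names `point_on_circle` and `slope_from_point` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction

def point_on_circle(t):
    """Second intersection of y = t*x + 1 with x^2 + y^2 = 1,
    following formulas (3.1.1) and (3.1.2)."""
    t = Fraction(t)
    d = t * t + 1
    return (-2 * t / d, (1 - t * t) / d)

def slope_from_point(p):
    """Slope of the line joining p to (0, 1); requires p[0] != 0."""
    x, y = p
    return (y - 1) / x

p = point_on_circle(Fraction(1, 2))
print(p)                     # a rational point on the circle
print(slope_from_point(p))   # recovers the slope t
```

Clearing denominators in the point for $t = 1/2$ gives, up to sign, the familiar triple $(4, 3, 5)$.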

The key geometric trick which made this argument work was to find one
rational solution $Q$ of $x^2 + y^2 = 1$ and then parameterize all other solutions
by intersecting our equation with a rationally sloped line through $Q$. The
reader may wonder if Diophantine equations other than $x^2 + y^2 = z^2$ may
also be solved using this method. This question forms part of a beautiful
subject known as Arithmetic Geometry. Equations of degree 3 in $x, y, z$
are objects known as elliptic curves; the study of rational points on elliptic curves is a
subject of active research and has deep connections to the Fermat equation 
mentioned at the beginning of this chapter. For equations of degree at least 
4 in x,y,z, a theorem of Faltings shows that there are only finitely many 
rational solutions. Faltings was awarded the Fields Medal for his seminal 
work on this subject. 


Exercises 


1. Show that there are infinitely many relatively prime solutions of 


og +y? = 2". 
2. Find all solutions of $x^2 + 3y^2 = z^2$.


3. Find all relatively prime solutions of $x^2 + 2y^2 = z^2$.


4. Use the point $Q = (1,1)$ to find all rational points on the circle $x^2 + y^2 = 2$.


Solve the Diophantine equation «? + 44? = 2°. 


3.2. Fermat’s Equation for n = 4 


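The complete solution of the Pythagorean triple problem allows us to analyze
the Diophantine equation

\[ x^4 + y^4 = z^2. \]

It will be shown that this has no solutions in positive integers. Hence the Fermat equation

\[ x^4 + y^4 = z^4 \]

has no solutions in positive integers either, since such a solution $(x, y, z)$ would give
the solution $(x, y, z^2)$ of the first equation.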

The method of proof is called Fermat’s method of infinite descent. The 
basic idea is to start with the smallest possible solution (if it exists), meaning 
that z is as small as possible. Then using the given solution, construct a 
smaller solution. Of course, this is a contradiction which implies that the 
assumption that there were any solutions at all was wrong. This is called 
infinite descent because one can construct an infinite sequence of smaller 
and smaller solutions, which is not possible. 


3.2.1. Theorem. The equation $x^4 + y^4 = z^2$ has no positive integer
solutions.

Proof. So, let us assume that there are solutions of $x^4 + y^4 = z^2$ in positive
integers. Among all solutions, we choose $x, y$ and $z$ so that $z$ is minimal. In
particular, $\gcd(x,y) = 1$. Since $(x^2, y^2, z)$ is a Pythagorean triple, there
are relatively prime positive integers $u$ and $v$ such that

\[ x^2 = 2uv, \qquad y^2 = u^2 - v^2, \qquad z = u^2 + v^2. \]

(It may be necessary to interchange $x$ and $y$ so that $x$ is even, and $y$ is odd.)

This produces another Pythagorean triple $v^2 + y^2 = u^2$. Thus, $v$ must
be even, as $y$ is odd. Consider the equation $x^2 = (2v)u$. As $\gcd(u, 2v) = 1$,
it follows that $u$ and $2v$ are squares. Hence there are positive integers $a$ and
$b$ so that $u = a^2$ and $v = 2b^2$.

Using the solution for the Pythagorean triple $v^2 + y^2 = u^2$, we obtain
relatively prime positive integers $c$ and $d$ so that

\[ v = 2cd, \qquad y = c^2 - d^2, \qquad u = c^2 + d^2. \]

Hence, $b^2 = cd$ and $a^2 = c^2 + d^2$. Once again, since $b^2 = cd$ and $\gcd(c, d) = 1$,
it follows that $c$ and $d$ are perfect squares, say $c = m^2$ and $d = n^2$.
Substituting back in yields

\[ m^4 + n^4 = a^2. \]

Finally, $a \le a^2 = u < u^2 + v^2 = z$.

So, we have succeeded in producing a smaller solution of our equation, 
contrary to the hypothesis that we started with the smallest one. This must 
imply that there are no solutions at all. | 


Exercises 


1. Show that there are no positive integer solutions to $x^4 + 4y^4 = z^2$.


2. Show that there are no positive integer solutions to $x^4 - y^4 = z^2$.


3. Show that there is no right angle triangle with sides of integer lengths 
whose area is a perfect square. 


4. Solve x? +12 = y' for z,y EN. 


Show that if 2, y,p € N with p is prime and x? + y? = p, then p = 2. 
What changes if we allow x,y € Z? Find a few solutions. 


6. Find all integer solutions of $x^2 - 11y^2 = 3$.
HINT: solve it mod 8 first. 


7. Find all integer solutions of x4 + y* = 1324. 
HINT: try to solve it modulo some primes. 

3.3. Quadratic Number Domains


A number $d$ is called square free if it has no repeated prime factor. Let $d$
be a square free integer (except 1). Define

\[ \mathbb{Z}[\sqrt{d}] = \{ n + m\sqrt{d} : n, m \in \mathbb{Z} \}. \]

One may check directly that this set is closed under addition and multiplication,
and thus is a commutative ring. It also has the important property

\[ \text{if } x, y \in \mathbb{Z}[\sqrt{d}] \text{ and } xy = 0, \text{ then } x = 0 \text{ or } y = 0. \]

This follows since $\mathbb{Z}[\sqrt{d}]$ is contained in the real numbers $\mathbb{R}$ (when $d > 0$)
or the complex numbers $\mathbb{C}$ (when $d < 0$), both of which have this property.
In other words, $\mathbb{Z}[\sqrt{d}]$ is an integral domain.

In fact, $\mathbb{Z}[\sqrt{d}]$ sits inside a smaller field

\[ \mathbb{Q}[\sqrt{d}] = \{ r + s\sqrt{d} : r, s \in \mathbb{Q} \}. \]

One checks that $\mathbb{Q}[\sqrt{d}]$ is an integral domain. To see that non-zero elements
have inverses, notice that

\[ (r + s\sqrt{d})^{-1} = \frac{r - s\sqrt{d}}{r^2 - ds^2}. \]

It is a simple exercise based on the irrationality of $\sqrt{d}$ to see that

\[ r + s\sqrt{d} = a + b\sqrt{d} \]

implies that $a = r$ and $b = s$ for all rational numbers $a, b, r$ and $s$. In
particular, $r^2 - ds^2 \ne 0$ unless $r = s = 0$. Now we will introduce an
important function which will make computations possible.
important function which will make computations possible. 


3.3.1. Definition. For $x = r + s\sqrt{d} \in \mathbb{Q}[\sqrt{d}]$, define the conjugate of $x$
to be $\bar{x} = r - s\sqrt{d}$. Let the norm of $x$ be $N(x) = x\bar{x} = r^2 - ds^2$.


Note that if $x \in \mathbb{Q}[\sqrt{d}]$, then $N(x)$ is rational. If $x \in \mathbb{Z}[\sqrt{d}]$, then $N(x)$
is an integer.


3.3.2. Lemma. For $x, y \in \mathbb{Q}[\sqrt{d}]$,

(1) $\overline{x + y} = \bar{x} + \bar{y}$.

(2) $\overline{xy} = \bar{x}\,\bar{y}$.

(3) $N(xy) = N(x)N(y)$.

(4) $N(x) = 0$ if and only if $x = 0$.


Proof. The proof consists of straightforward computations, and will be left
to the exercises. ∎

Recall that a unit $x$ of a commutative ring $R$ is an
element with an inverse $y$, i.e., $xy = 1$. In $\mathbb{Z}[\sqrt{2}]$, a simple calculation shows
that

\[ (17 + 12\sqrt{2})(17 - 12\sqrt{2}) = 1. \]

So $17 + 12\sqrt{2}$ is a unit in $\mathbb{Z}[\sqrt{2}]$. We need a criterion to decide when something
is a unit.


3.3.3. Proposition. An element $x \in \mathbb{Z}[\sqrt{d}]$ is a unit if and only if $N(x) = \pm 1$.


Proof. If $xy = 1$, then $N(x)N(y) = N(1) = 1$. But $N(x)$ and $N(y)$ are
integers, so they are both $\pm 1$. Conversely, if $N(x) = \pm 1$, then $y = N(x)\bar{x}$ satisfies
$xy = N(x)^2 = 1$. So $x$ is a unit. ∎


This proposition shows that the units of $\mathbb{Z}[\sqrt{d}]$ correspond exactly to
integer solutions of Pell's equation

\[ n^2 - dm^2 = \pm 1. \]

When $d$ is positive, there are always infinitely many solutions. We will look
at a few special cases in the next section. When $d \le -2$, only $\pm 1$ are units.
The case $d = -1$ is special. See Section 3.5.
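
The norm and the unit criterion are easy to experiment with. The following Python sketch is one possible way to model $n + m\sqrt{d}$ by its integer coordinates; the class name `QuadInt` and its methods are assumptions made for this illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuadInt:
    """The element n + m*sqrt(d) of Z[sqrt(d)], stored as integers (n, m, d)."""
    n: int
    m: int
    d: int

    def __mul__(self, other):
        # (n1 + m1*r)(n2 + m2*r) with r*r = d
        assert self.d == other.d
        return QuadInt(self.n * other.n + self.d * self.m * other.m,
                       self.n * other.m + self.m * other.n,
                       self.d)

    def conjugate(self):
        return QuadInt(self.n, -self.m, self.d)

    def norm(self):
        # N(x) = n^2 - d*m^2
        return self.n * self.n - self.d * self.m * self.m

    def is_unit(self):
        # Proposition 3.3.3: units are exactly the elements of norm +1 or -1
        return self.norm() in (1, -1)

x = QuadInt(17, 12, 2)
print(x.norm(), x.is_unit(), x * x.conjugate())
```

Here `x * x.conjugate()` reproduces the calculation $(17 + 12\sqrt{2})(17 - 12\sqrt{2}) = 1$ from the text.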


3.3.4. Definition. In a quadratic number domain, an element $x$ is called
a prime if (i) $x$ is not a unit, and (ii) whenever $x$ factors as $x = ab$, either
$a$ or $b$ is a unit.


3.3.5. Remark. Notice that this definition is a special case of the one
given in Definition 1.8.6. For rings more general than quadratic number
domains, the term “irreducible” is used instead of “prime.”


One can factor 2 in $\mathbb{Z}$ as

\[ 2 = (1)(2) = (2)(1) = (-1)(-2) = (-2)(-1). \]

We consider these to be trivial factorizations because one factor is always a
unit. The primes in $\mathbb{Z}$ by this definition are just the ordinary primes and
their negatives.

The following lemma gives us a simple test for primes. However, the
converse is not true; so be careful how you use it.


3.3.6. Lemma. If $N(x)$ is prime, then $x$ is a prime.


Proof. If $x = ab$, then $N(x) = N(a)N(b)$. If $N(x)$ is prime, then either
$N(a)$ or $N(b)$ equals $\pm 1$. Hence either $a$ or $b$ is a unit by Proposition 3.3.3.
Also $x$ itself is not a unit, since $N(x) \ne \pm 1$. Therefore $x$ is prime. ∎

3.3.7. Example. Consider $\mathbb{Z}[\sqrt{2}]$ again. Since $N(2 + \sqrt{2}) = 2$ is prime,
$2 + \sqrt{2}$ is a prime. The number 2 itself is not prime! It factors as $2 = \sqrt{2}\sqrt{2}$,
and $N(\sqrt{2}) = -2 \ne \pm 1$. Also 7 is not prime because $7 = (3 - \sqrt{2})(3 + \sqrt{2})$.

The integer 5 is prime in $\mathbb{Z}[\sqrt{2}]$, even though $N(5) = 25$ is not prime.
If 5 were not prime, it would factor as $5 = xy$, where neither $x$ nor $y$ is a
unit. Then $25 = N(5) = N(x)N(y)$. Since $x$ and $y$ are not units, neither
$N(x)$ nor $N(y)$ equals $\pm 1$. Thus, one must have $N(x) = \pm 5$ and $N(y) = \pm 5$. Let
us write $x = n + m\sqrt{2}$. Then

\[ n^2 - 2m^2 = \pm 5. \]

This is impossible. To see this, consider this equation modulo 5. One obtains

\[ n^2 \equiv 2m^2 \pmod 5. \]

However, the squares modulo 5 are congruent to 0, 1 or 4, so twice a square is
congruent to 0, 2 or 3. Thus the only solution occurs when

\[ n \equiv m \equiv 0 \pmod 5. \]

Thus the only way that $n^2 - 2m^2$ can be a multiple of 5 is if both $n$ and $m$
are multiples of 5. Then $n^2 - 2m^2$ is a multiple of 25; and so never equals
$\pm 5$. We conclude that 5 is prime in $\mathbb{Z}[\sqrt{2}]$.
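
The congruence step in this example is a finite check, so it can be confirmed by listing residues. The Python sketch below does this by brute force; the helper name `solutions_mod` is an assumption of the example.

```python
def solutions_mod(d, c, modulus):
    """All residue pairs (n, m) modulo `modulus` with n^2 - d*m^2 ≡ c."""
    return [(n, m)
            for n in range(modulus)
            for m in range(modulus)
            if (n * n - d * m * m - c) % modulus == 0]

# n^2 ≡ 2 m^2 (mod 5) forces n ≡ m ≡ 0 (mod 5)
pairs = solutions_mod(2, 0, 5)
print(pairs)
```

The same helper can be used to test other norm equations modulo a small number before attempting a proof.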


Let us show that every element of $\mathbb{Z}[\sqrt{d}]$ has at least one factorization
into primes. Later, we will discuss what unique factorization should mean.


3.3.8. Lemma. Every non-zero element of $\mathbb{Z}[\sqrt{d}]$ factors as the product
of a unit and finitely many primes.


Proof. The proof is basically the same as the proof we gave for the integers.
The size of elements of $\mathbb{Z}[\sqrt{d}]$ will be measured by the norm function.
Consider the set

\[ S = \{ x \in \mathbb{Z}[\sqrt{d}] : x \ne 0 \text{ and } x \text{ does not factor as a unit times a finite product of primes} \}. \]

If this set is empty, the lemma is true. Otherwise, the set

\[ \{ |N(x)| : x \in S \} \]

has a smallest element. Let $x$ be an element of $S$ for which $|N(x)|$ is as
small as possible. Now $x$ is not a unit, and if $x$ were prime, it would factor as the product of one
prime and so would not belong to $S$. Hence $x$ factors as $x = ab$ where neither $a$ nor $b$ is a
unit, so that $|N(a)| < |N(x)|$ and $|N(b)| < |N(x)|$. Therefore, both $a$ and $b$ must factor
as products of primes, say

\[ a = u p_1 \cdots p_k \qquad\text{and}\qquad b = v q_1 \cdots q_l, \]

where $u$ and $v$ are units and $p_i$ and $q_j$ are all primes. But then

\[ x = (uv) p_1 \cdots p_k q_1 \cdots q_l \]

is the desired factorization of $x$. This contradicts the fact that $x$ belongs to
$S$. We conclude that $S$ is empty and the lemma is true. ∎


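What does unique factorization mean in this context? Consider

\[ 11 = (5\sqrt{3} + 8)(5\sqrt{3} - 8) = (2\sqrt{3} - 1)(2\sqrt{3} + 1). \]

Notice that

\[ N(5\sqrt{3} + 8) = N(2\sqrt{3} + 1) = -11, \]

and the other two factors also have norm $-11$. Since $-11$ is prime in $\mathbb{Z}$, the factors are
prime. Are they really two different factorizations of 11
in $\mathbb{Z}[\sqrt{3}]$? No, they're not. Notice that $2 - \sqrt{3}$ is a unit with inverse $2 + \sqrt{3}$.
Now

\[ (5\sqrt{3} + 8)(2 - \sqrt{3}) = 2\sqrt{3} + 1. \]

So these two primes are in the same relationship here as $\pm 5$ are in $\mathbb{Z}$. Two
primes $p$ and $q$ are called associates if there is a unit $u$ such that $q = up$.
So we can compute

\[ \begin{aligned}
11 &= (5\sqrt{3} + 8)(5\sqrt{3} - 8) \\
   &= \bigl((5\sqrt{3} + 8)(2 - \sqrt{3})\bigr)\bigl((2 + \sqrt{3})(5\sqrt{3} - 8)\bigr) \\
   &= (2\sqrt{3} + 1)(2\sqrt{3} - 1) \\
   &= (2\sqrt{3} - 1)(2\sqrt{3} + 1).
\end{aligned} \]

These two factorizations are essentially the same because the only difference
is obtained by multiplying primes by units, and permuting the factors.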
On the other hand, consider the following situation in $\mathbb{Z}[\sqrt{10}]$:

\[ 6 = (2)(3) = (4 + \sqrt{10})(4 - \sqrt{10}). \]

We compute that $N(2) = 4$, $N(3) = 9$, and $N(4 \pm \sqrt{10}) = 6$. If these
numbers factor non-trivially in $\mathbb{Z}[\sqrt{10}]$, then there would be elements $x =
n + m\sqrt{10}$ with $N(x) = n^2 - 10m^2 = \pm 2$ or $\pm 3$. However,
reducing modulo 10, this requires that $n^2 \equiv 2, 3, 7$ or $8 \pmod{10}$. But a
perfect square is congruent to 0, 1, 4, 5, 6, or 9 (mod 10). Therefore 2, 3
and $4 \pm \sqrt{10}$ are primes in $\mathbb{Z}[\sqrt{10}]$. Neither 2 nor 3 is an associate of $4 \pm \sqrt{10}$
because the absolute values of their norms are different. So the domain $\mathbb{Z}[\sqrt{10}]$ does not have the
unique factorization property.

A domain in which every element has exactly one factorization into 
primes up to permutations and multiplication by units is called a Unique 
Factorization Domain or UFD. The key is the analogue of Lemma 
Some of these domains have a Euclidean algorithm, which is easily deduced 
if there is a division algorithm. See Section [1.8] for an introduction to these 
ideas. Try it out with $x = 2$ and $y = 4 + \sqrt{10}$ to see that this does not
hold in $\mathbb{Z}[\sqrt{10}]$. It is an interesting and difficult problem in number theory
to determine which quadratic number domains are Euclidean, and which
are UFD's. There are only finitely many that are Euclidean with respect to the norm. There are
more UFD's, and it is conjectured that there are infinitely many of them.
The interested reader should consult a book on number theory to get more
information. We recommend Stark [37].

There is one more subtle point. The ring $\mathbb{Z}[\sqrt{5}]$ is not a UFD. To see this,
notice that $4 = (\sqrt{5} + 1)(\sqrt{5} - 1) = (2)(2)$. Also, all the factors have norm $\pm 4$.
We see that $n^2 - 5m^2 = \pm 2$ has no solutions by looking at this equation mod
5, so all of these factors are prime. Clearly, 2 and $\sqrt{5} + 1$ are not associates. Thus factorization is not unique
in $\mathbb{Z}[\sqrt{5}]$. However, in this case, the reason is that we left some important
elements out of our ring. All the numbers $x = n + m\sqrt{5}$ satisfy a monic
quadratic equation with integer coefficients, namely

\[ X^2 - 2nX + (n^2 - 5m^2) = 0. \]

However, the element $(1 + \sqrt{5})/2$ belongs to $\mathbb{Q}[\sqrt{5}]$, and is a root of $X^2 -
X - 1$. The collection of all numbers in $\mathbb{Q}[\sqrt{5}]$ satisfying such an equation
turns out to be all numbers of the form $(n + m\sqrt{5})/2$ where $n$ and $m$ are
integers such that $n \equiv m \pmod 2$. In this case, $N(x) = \frac{n^2 - 5m^2}{4} \in \mathbb{Z}$. In
the larger ring $\mathbb{Z}\bigl[\frac{1 + \sqrt{5}}{2}\bigr]$, there are the units $(1 \pm \sqrt{5})/2$. It is known as the
ring of integers in $\mathbb{Q}[\sqrt{5}]$ because this is the set of all elements in $\mathbb{Q}[\sqrt{5}]$
that satisfy a monic quadratic equation with integer coefficients. Now 2 and $1 + \sqrt{5}$ are
associates in $\mathbb{Z}\bigl[\frac{1 + \sqrt{5}}{2}\bigr]$. In fact,
$\mathbb{Z}\bigl[\frac{1 + \sqrt{5}}{2}\bigr]$ is a Euclidean Domain.

It can be shown that the ring of integers in $\mathbb{Q}[\sqrt{d}]$ is $\mathbb{Z}[\sqrt{d}]$ when $d \not\equiv
1 \pmod 4$, and $\mathbb{Z}\bigl[\frac{1 + \sqrt{d}}{2}\bigr]$ when $d \equiv 1 \pmod 4$. Moreover, when $d \equiv 1
\pmod 4$, $\mathbb{Z}[\sqrt{d}]$ can never be a UFD. To see this, let $d = 4k + 1$. Notice that

\[ 2 \mid 4k = (1 + \sqrt{d})(-1 + \sqrt{d}). \]

We claim that 2 is prime in $\mathbb{Z}[\sqrt{d}]$. It has norm $N(2) = 4$, so any proper
factor must have norm $\pm 2$. Consider the equation

\[ \pm 2 = n^2 - dm^2 \equiv n^2 - m^2 \pmod 4. \]

The left-hand side is congruent to 2 (mod 4), which can never be the difference
of two squares. Now the prime 2 divides $(1 + \sqrt{d})(-1 + \sqrt{d})$, but does
not divide either $\pm 1 + \sqrt{d}$. So there is no unique factorization.

The list of the rings of integers of $\mathbb{Q}[\sqrt{d}]$ which are Euclidean domains
with respect to the norm function is finite:

\[ d = -11, -7, -3, -2, -1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73. \]


The list of Euclidean domains for some other function includes d = 14, and 
may be infinite. The list of UFDs is larger, and is almost surely infinite. The 
negative values of d are all known though, and there are only finitely many. 
In addition to the norm Euclidean domains, there are —163, —67, —43, —19. 
The additional positive ones with d < 100 are 


14, 22, 23, 31, 38, 43, 46, 47, 53, 59, 61, 62, 71, 77, 83, 86, 89, 93, 94, 97. 


Exercises


1. Show that $n + m\sqrt{d} = k + l\sqrt{d}$ for $k, l, m, n \in \mathbb{Q}$ implies that $k = n$ and
$l = m$.


2. Verify Lemma 3.3.2.


3. (a) Show that 2 and 3 are not prime in $\mathbb{Z}[\sqrt{3}]$.
(b) Show that $5 - 2\sqrt{3}$ is prime in $\mathbb{Z}[\sqrt{3}]$.
(c) Show that 5 is prime in $\mathbb{Z}[\sqrt{3}]$.

4. Show that there is no division algorithm for $\mathbb{Z}[\sqrt{10}]$ with $f(x) = |N(x)|$
by showing that any remainder on dividing $4 + \sqrt{10}$ by 2 has norm with
absolute value at least 6.
HINT: consider the norm of the remainder modulo 20.


5. Show that there are infinitely many integer solutions of $n^2 - 3m^2 = 1$.
Find an explicit recursion formula that generates your set of solutions.


6. Show that $n^2 - 5m^2 = 2$ has no solutions.


3.4. Pell's Equation

The units (invertible elements) of $\mathbb{Z}[\sqrt{d}]$ are of the form $x + y\sqrt{d}$ such that

\[ x^2 - dy^2 = \pm 1. \]

For $d$ positive, one might suspect that there are non-trivial solutions. In
fact, there are always infinitely many solutions for every positive square free
$d$. The proof of this is beyond the scope of this book. If you are interested,
consult [37]. The proof is based on the theory of continued fractions. Brute
force is not likely to succeed with this problem because some fairly small
numbers have very large smallest solutions. For example, for $d = 109$, the
smallest positive solution of $x^2 - 109y^2 = 1$ is

\[ x = 158\,070\,671\,986\,249, \qquad y = 15\,140\,424\,455\,100. \]
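
Numbers of this size are awkward by hand but easy to check with exact integer arithmetic. The short Python sketch below verifies the quoted pair for $d = 109$. It also checks a smaller pair of norm $-1$, namely $(8890182, 851525)$, which is not stated in the text and is included here only as an illustrative assumption that the code itself verifies; squaring it in $\mathbb{Z}[\sqrt{109}]$ gives the pair above.

```python
# Python integers have arbitrary precision, so these checks are exact.
x, y = 158070671986249, 15140424455100
residual = x * x - 109 * y * y
print(residual)

# A smaller pair of norm -1 (an assumption of this example, checked below).
a, b = 8890182, 851525
small_residual = a * a - 109 * b * b
# (a + b*sqrt(109))^2 = (a^2 + 109 b^2) + (2ab) sqrt(109)
squared = (a * a + 109 * b * b, 2 * a * b)
print(small_residual, squared == (x, y))
```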


This problem has a long history, and it was completely solved in 1150
by Bhaskara. Fermat solved it for $d < 150$ and challenged a group of British
mathematicians to solve certain larger numbers. This was done by Brouncker,
but later falsely attributed to Pell by Euler. It seems that Pell was not
responsible for either the problem or its solution, but his name has stuck.

In this section, we will solve the special case

\[ x^2 - 5y^2 = \pm 1. \]

We see $x = 2$ and $y = 1$ gives a non-trivial solution. This means that $2 + \sqrt{5}$
is a unit in $\mathbb{Z}[\sqrt{5}]$ of norm $-1$. Thus any power of it is a unit (with norms
alternating $\pm 1$). That is, the pairs $\{\pm x_n, \pm y_n\}$ obtained from

\[ x_n + y_n\sqrt{5} = (2 + \sqrt{5})^n \]

are solutions. The even pairs $\{\pm x_{2n}, \pm y_{2n}\}$ are solutions of $x^2 - 5y^2 = 1$,
and the odd pairs $\{\pm x_{2n+1}, \pm y_{2n+1}\}$ are solutions of $x^2 - 5y^2 = -1$. The
method of descent can now be used to show that this list of solutions is
complete. Indeed, this idea can be used for any $d$ to show that if Pell's
equation has one non-trivial solution, then it has infinitely many. See the
exercises.


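3.4.1. Theorem. All solutions of the equation $x^2 - 5y^2 = \pm 1$ are given
by the pairs $\{\pm x_n, \pm y_n\}$ for $n \ge 0$ obtained from the identities

\[ x_n + y_n\sqrt{5} = (2 + \sqrt{5})^n. \]

This leads to the recursive formulae

\[ x_0 = 1, \quad y_0 = 0 \quad\text{and}\quad x_{n+1} = 2x_n + 5y_n, \quad y_{n+1} = x_n + 2y_n \quad\text{for } n \ge 0. \]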


Proof. First note that from the previous discussion, the pairs $\{\pm x_n, \pm y_n\}$
for $n \ge 0$ are indeed solutions. From this formula, we obtain

\[ \begin{aligned}
x_{n+1} + y_{n+1}\sqrt{5} &= (x_n + y_n\sqrt{5})(2 + \sqrt{5}) \\
&= (2x_n + 5y_n) + (x_n + 2y_n)\sqrt{5}.
\end{aligned} \]

So the recursive equations for $x_{n+1}$ and $y_{n+1}$ follow immediately.

Suppose that the set $S$ of non-negative integer solutions which are not in
this list is non-empty. We can then choose the solution $\{x, y\}$ in $S$ so that $y$ is as
small as possible. The plan is to use Fermat's method of infinite descent to
show that there is a smaller solution in $S$, a contradiction, and hence $S = \emptyset$
and our list must be complete. The idea is that $x + y\sqrt{5}$ is a unit in $\mathbb{Z}[\sqrt{5}]$,
as is $(2 + \sqrt{5})^{-1} = \sqrt{5} - 2$. Hence,

\[ (x + y\sqrt{5})(\sqrt{5} - 2) = (5y - 2x) + (x - 2y)\sqrt{5} \]

is a unit. Thus, $\{5y - 2x, x - 2y\}$ is a solution.

The rest of the proof is just a computation to show that this is indeed a
smaller positive solution that is not in our list. The only non-negative solutions with
$y = 0$ or $y = 1$ are $\{1, 0\}$ and $\{2, 1\}$, which are in the list, so $y \ge 2$. Since

\[ 4y^2 < 5y^2 \pm 1 = x^2 < 6y^2, \]

it follows that $2y < x < \sqrt{6}\,y$. Hence,

\[ 0 < (5 - 2\sqrt{6})y < 5y - 2x < 5y - 4y = y, \]

and

\[ 0 = 2y - 2y < x - 2y < (\sqrt{6} - 2)y < y. \]

Consequently, we have obtained a smaller non-negative solution than we
started with. This solution cannot be $\{x_n, y_n\}$ from our list. For then,

\[ \begin{aligned}
x + y\sqrt{5} &= \bigl((5y - 2x) + (x - 2y)\sqrt{5}\bigr)(2 + \sqrt{5}) \\
&= (x_n + y_n\sqrt{5})(2 + \sqrt{5}) \\
&= x_{n+1} + y_{n+1}\sqrt{5}.
\end{aligned} \]

Hence $(5y - 2x, x - 2y) \in S$, and $0 < x - 2y < y$, contradicting the fact that
$(x, y)$ had the smallest 2nd coordinate in $S$. Therefore we have obtained the
desired contradiction. ∎
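
The recursion in Theorem 3.4.1 is easy to run, and for small values it can be compared against a direct search. The Python sketch below does both; the function names `pell5` and `brute_force` are hypothetical and chosen only for this illustration.

```python
from math import isqrt

def pell5(count):
    """First `count` pairs (x_n, y_n) with x_n + y_n*sqrt(5) = (2 + sqrt(5))**n,
    generated by x_{n+1} = 2x_n + 5y_n, y_{n+1} = x_n + 2y_n."""
    x, y = 1, 0
    out = []
    for _ in range(count):
        out.append((x, y))
        x, y = 2 * x + 5 * y, x + 2 * y
    return out

def brute_force(bound):
    """All non-negative (x, y) with y <= bound and x^2 - 5y^2 = +1 or -1."""
    found = []
    for y in range(bound + 1):
        for target in (5 * y * y - 1, 5 * y * y + 1):
            if target >= 0:
                x = isqrt(target)
                if x * x == target:
                    found.append((x, y))
    return found

listed = pell5(6)
print(listed)
print(brute_force(305) == listed)
```

The comparison only covers $y \le 305$, so it illustrates the theorem rather than proving it; the descent argument is what shows the list is complete.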


Exercises 


1. Find all solutions of $x^2 - 2y^2 = \pm 1$.


2. Show by induction that the positive solutions of $x^2 - 5y^2 = \pm 1$ obtained
above are given by the formulae

\[ x_n = \frac{(2 + \sqrt{5})^n + (2 - \sqrt{5})^n}{2} \qquad\text{and}\qquad y_n = \frac{(2 + \sqrt{5})^n - (2 - \sqrt{5})^n}{2\sqrt{5}}. \]

The notation $\lceil x \rceil$ and $\lfloor x \rfloor$ mean the least integer $n \ge x$ and the greatest
integer $m \le x$ respectively. Deduce that


tn = [(2+ V5)"/2] and yn = [(2+ V5)"/2V5]. 


3. (a) Show that the elements of Z[14¥5) are all elements of the form 
a+tb/5 
2 


n 


where a = b (mod 2). 

(b) Show that the set of units of Z[ 14-5) have the form Ban esa where 
untvnv’s = (434)" for n> 0. 

(c) By Theorem 3.4.1, $2 + \sqrt{5}$ is a unit. Where does it fit into this list?


4. Show that there are infinitely many Pythagorean triples with $y = x + 1$;
i.e., solutions of the form $(x, x + 1, z)$.
HINT: reduce it to Pell's equation for $d = 2$. Hence find the smallest
solution larger than $696^2 + 697^2 = 985^2$; i.e. $z > 985$.


5. Prove that if $x^2 - dy^2 = 1$ has one positive solution, then it has infinitely
many. If $x^2 - dy^2 = -1$ has one positive solution, then it and $x^2 - dy^2 = 1$
have infinitely many solutions.


6. Show that $n^2 - 5m^2 = 11$ has infinitely many solutions.
HINT: this is the norm of an element in $\mathbb{Z}[\sqrt{5}]$.


7. Show that there are infinitely many positive integers $a$ such that both
$a + 1$ and $3a + 1$ are perfect squares.
HINT: reduce this to a question of elements in $\mathbb{Z}[\sqrt{3}]$ with specified
norm.


3.5. The Gaussian Integers 


When $d < 0$, the ring $\mathbb{Z}[\sqrt{d}]$ lies in the complex numbers $\mathbb{C}$, not the reals.
For this section, some familiarity with complex numbers will be assumed.
The ideas of complex numbers will be formally introduced in Chapter 5. We
use the notation $i = \sqrt{-1}$ for one (fixed) square root of $-1$. The Gaussian
integers $\mathbb{Z}[\sqrt{-1}]$ consist of all complex numbers of the form $n + mi$ for
integers $n$ and $m$. The norm function is $N(n + mi) = n^2 + m^2$, and this is
always a non-negative integer.

Let us find all of the units. If $u = n + mi$ is a unit, then $n^2 + m^2 = 1$.
Hence one of $n$ or $m$ is 0 and the other is $\pm 1$. So the units are $\pm 1$ and $\pm i$.

We wish to establish unique factorization in this domain. By Theorem 
and Remark it is enough to show that the Gaussian integers 
are a Euclidean domain for the norm function, i.e. they have a division 
algorithm. 


3.5.1. Proposition. Suppose that $a, b \in \mathbb{Z}[\sqrt{-1}]$, and $a \ne 0$. Then there
are elements $q, r \in \mathbb{Z}[\sqrt{-1}]$ such that $b = aq + r$ and $0 \le N(r) < N(a)$.


Proof. Since $b/a \in \mathbb{Q}[\sqrt{-1}]$, it can be written as $b/a = u + iv$ where $u$ and $v$
are rational. Pick integers $n$ and $m$ so that $|u - n| \le 1/2$ and $|v - m| \le 1/2$.
Set $q = n + im$, and

\[ r = b - aq = a(u + iv) - a(n + im) = a\bigl((u - n) + i(v - m)\bigr). \]

Then using the fact that $N$ is defined and multiplicative on $\mathbb{Q}[\sqrt{-1}]$,

\[ N(r) = N(a)\bigl(|u - n|^2 + |v - m|^2\bigr) \le \frac{N(a)}{2}. \]

Thus the remainder $r$ is sufficiently small. ∎
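
The proof is constructive, and the rounding step can be carried out with integer arithmetic alone. The following Python sketch represents a Gaussian integer $n + mi$ as the pair `(n, m)`; the function name `gauss_divmod` and this representation are assumptions of the example.

```python
def gauss_divmod(b, a):
    """Return (q, r) with b = a*q + r and N(r) <= N(a)/2.
    Gaussian integers are (re, im) pairs; a must be non-zero."""
    (b1, b2), (a1, a2) = b, a
    na = a1 * a1 + a2 * a2                 # N(a)
    # b/a = b*conj(a)/N(a) = (s + t*i)/N(a)
    s = b1 * a1 + b2 * a2
    t = b2 * a1 - b1 * a2
    # nearest integers to s/na and t/na, computed exactly
    n = (2 * s + na) // (2 * na)
    m = (2 * t + na) // (2 * na)
    # r = b - a*q with q = n + m*i
    r = (b1 - (a1 * n - a2 * m), b2 - (a1 * m + a2 * n))
    return (n, m), r

q, r = gauss_divmod((27, 23), (8, 1))
print(q, r)
```

For $b = 27 + 23i$ and $a = 8 + i$ this gives $q = 4 + 2i$ and $r = -3 + 3i$, with $N(r) = 18 \le N(a)/2 = 32.5$.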


3.5.2. Theorem. Unique Factorization for Gaussian Integers.
Suppose that $a$ is a non-zero element of $\mathbb{Z}[\sqrt{-1}]$, and that it factors in two
ways:

\[ a = u p_1 \cdots p_k = v q_1 \cdots q_l, \]

where $u$ and $v$ are units and $p_i$ and $q_j$ are all primes. Then $k = l$ and there
is a permutation $\pi$ so that $p_i$ and $q_{\pi(i)}$ are associates for $1 \le i \le k$.


Proof. By Proposition 3.5.1 and Remark 1.8.9, the hypotheses of Theorem
1.8.18 hold. So, the Gaussian integers have unique factorization. ∎


In this ring, it is possible to describe all the primes. The argument
will be split into two parts. The first theorem is of independent interest.
The reader should notice that if item (5) were omitted from the list of
equivalences, it would not appear to have anything to do with the Gaussian
integers. However, the Unique Factorization theorem for this ring is crucial
to the proof.

3.5.3. Theorem. Let $p$ be an odd prime. Then the following are equivalent:

(1) $p \equiv 1 \pmod 4$.

(2) $x^2 + 1 \equiv 0 \pmod p$ has a solution.

(3) There are integers $n$ and $m$ which are not multiples of $p$ so that
$p \mid n^2 + m^2$.

(4) $p$ is the sum of two squares: $p = a^2 + b^2$.

(5) $p$ factors as $p = (a + ib)(a - ib)$ in $\mathbb{Z}[\sqrt{-1}]$.


Proof. Suppose that (1) holds, and write $p = 4n + 1$. Let $a = (2n)!$. Then
by Wilson's Theorem,

\[ a^2 = \Bigl(\prod_{j=1}^{2n} j\Bigr)\Bigl(\prod_{j=1}^{2n} (-j)\Bigr) \equiv \prod_{j=1}^{2n} j\,(4n + 1 - j) = (4n)! \equiv -1 \pmod p. \]

So $a$ is a solution of (2).

If $n$ is a solution of $x^2 + 1 \equiv 0 \pmod p$ and $m = 1$, then $n^2 + m^2$ is a
multiple of $p$, so (3) holds.

Suppose that (3) holds. Notice that in $\mathbb{Z}[\sqrt{-1}]$, it is possible to factor
$n^2 + m^2$ as $(n + im)(n - im)$. If $p$ were prime in $\mathbb{Z}[\sqrt{-1}]$, it would divide one
of $n \pm im$. This then implies that $p$ divides both $n$ and $m$, contrary to fact.
Hence $p$ has a proper divisor $x \in \mathbb{Z}[\sqrt{-1}]$. It follows that $N(x)$ is a proper
divisor of $N(p) = p^2$. That is, $N(x) = p$. If $x = a + ib$, then $p = a^2 + b^2$.
This proves (4) and (5). Since $a^2 + b^2 = (a + ib)(a - ib)$, we see (5) implies
(4).

Finally, since $p$ is odd, one of $a, b$ is even and the other is odd. Therefore
$p = a^2 + b^2 \equiv 1 \pmod 4$. So (4) implies (1). ∎
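
The proof suggests a way to actually find $a$ and $b$ with $p = a^2 + b^2$: take $n = (2k)!$ where $p = 4k + 1$, so that $p$ divides $n^2 + 1 = (n + i)(n - i)$, and compute a greatest common divisor of $p$ and $n + i$ by the Euclidean algorithm in the Gaussian integers. The Python sketch below follows that plan. The function names are hypothetical, Gaussian integers are stored as pairs, and using the factorial is only practical for small primes.

```python
from math import factorial

def gauss_rem(b, a):
    """Remainder of b on division by a in Z[i]; elements are (re, im) pairs."""
    (b1, b2), (a1, a2) = b, a
    na = a1 * a1 + a2 * a2
    s, t = b1 * a1 + b2 * a2, b2 * a1 - b1 * a2      # b * conj(a)
    n, m = (2 * s + na) // (2 * na), (2 * t + na) // (2 * na)
    return (b1 - (a1 * n - a2 * m), b2 - (a1 * m + a2 * n))

def two_squares(p):
    """For a prime p ≡ 1 (mod 4), return (a, b) with a*a + b*b == p."""
    n = factorial((p - 1) // 2) % p      # n*n ≡ -1 (mod p), as in (1) implies (2)
    x, y = (p, 0), (n, 1)                # Euclidean algorithm on p and n + i
    while y != (0, 0):
        x, y = y, gauss_rem(x, y)
    return abs(x[0]), abs(x[1])

for p in (5, 13, 17, 29, 37, 41, 97, 101):
    a, b = two_squares(p)
    print(p, "=", a, "^2 +", b, "^2")
```

The common divisor found this way is a proper divisor of $p$, so it has norm $p$, exactly as in the step from (3) to (4).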


3.5.4. Theorem. The primes in $\mathbb{Z}[\sqrt{-1}]$ are:

(1) The elements of prime norm: the primes $\pm 1 \pm i$ of norm 2; and
the elements $x$ with $N(x) = p$, where $p$ is a prime congruent to 1
(mod 4).

(2) The elements $\pm p$ and $\pm ip$ where $p$ is a prime integer with $p \equiv 3
\pmod 4$.


Proof. By Lemma 3.3.6, it follows that if $N(x)$ is prime, then $x$ is prime.
For $N(x)$ to be prime, $x$ cannot be an integer or $i$ times an integer (these
elements have square norms). So $x = a + ib$, and $a$ and $b$ are not both even
(because 2 does not divide $x$). Hence $p = N(x) = a^2 + b^2$ is the sum of
squares, not both even. Thus, it must be congruent to 1 or 2 modulo 4.
Now 2 is the only prime congruent to 2 (mod 4), and one checks that
$\pm 1 \pm i$ are the only elements of norm 2. The others have odd prime norm.

Suppose that $p$ is an integer prime congruent to 3 mod 4. If this were not
prime in $\mathbb{Z}[\sqrt{-1}]$, it would factor as $p = xy$, say. But then

\[ p^2 = N(p) = N(x)N(y). \]

Neither $N(x)$ nor $N(y)$ is 1, so $N(x) = N(y) = p$. But $p \equiv 3 \pmod 4$,
and this is impossible for a norm which is a sum of two squares. Hence $p$ is
prime in $\mathbb{Z}[\sqrt{-1}]$. Its associates $\pm p$ and $\pm ip$ are then also prime.

It remains to show that there are no other primes. Let $x = n + mi$ be
a prime in $\mathbb{Z}[\sqrt{-1}]$. Its conjugate $\bar{x} = n - mi$ is also a prime. To see this,
notice that $x = ab$ if and only if $\bar{x} = \bar{a}\bar{b}$. So any factorization of $\bar{x}$ into
proper factors implies that $x$ also factors, contrary to fact.

Consider $N(x) = x\bar{x}$. If this is prime, it falls into case (1). Otherwise,
$N(x)$ factors non-trivially in the integers as

\[ x\bar{x} = N(x) = pq. \]

Now we can apply the Unique Factorization Theorem. The left-hand side
is the product of two primes. So the right-hand side must also be a
factorization into primes. Furthermore, $x$ is the associate of one, say $p$, and
$\bar{x}$ is the associate of the other, $q$. But if $u$ is a unit so that $x = up$, then
$\bar{x} = \bar{u}\bar{p} = \bar{u}p$. Hence $p$ is an associate of $\bar{x}$, and hence also an associate of $q$.
This means that $p = q$ is a prime.

There are two cases. If $x = \pm p$ or $\pm ip$, this falls into case (2). Otherwise,
$x = n + im$, where $n$ and $m$ are not multiples of $p$, but $n^2 + m^2 = p^2$ is
divisible by $p$. So by Theorem 3.5.3, $p \equiv 1 \pmod 4$. But then, by the same
theorem, we find out that $p$ (and so also $x$) is not prime in $\mathbb{Z}[\sqrt{-1}]$. That
eliminates this final possibility. ∎


A pretty application of this is a complete description of which numbers 
can be expressed as the sum of two squares. The key additional piece of 
information needed is the following computation. The proof is left to the 
reader. 


3.5.5. Lemma. Leta, b, x, and y be integers. Then 
(a? + b°) (a? + y?) = (aw + by)? + (ay — bx)? 


3.5.6. Theorem. Let n be a positive integer. Factor n as n = ab* where 
a is square free. Then n can be expressed as the sum of two squares if and 
only if a has no prime factors congruent to 3 (mod 4). 


Proof. First suppose that $a$ has no prime factors congruent to 3 (mod 4).
By Theorem 3.5.3, each prime factor of $a$ is the sum of two squares. Repeated
application of the lemma shows that their product is also the sum of two
squares. Finally, multiplying by $b^2$ preserves this as the sum of two squares:
if $a = u^2 + v^2$, then $ab^2 = (ub)^2 + (vb)^2$.

Conversely, suppose that $n = x^2 + y^2$, and that $p$ is a prime factor of $a$. Let
$p^k$ be the largest power of $p$ which divides both $x$ and $y$. Set $X = x/p^k$,
$Y = y/p^k$, and $N = n/p^{2k}$, so that $N = X^2 + Y^2$. Since an odd power of $p$
divides $n$, $N$ is still a multiple of $p$, but $X$ and $Y$ are not both multiples of $p$
(and hence neither is, because $p$ divides $X^2 + Y^2$). By Theorem 3.5.3, $p = 2$ or
$p \equiv 1 \pmod 4$. ∎
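
The criterion in Theorem 3.5.6 is easy to test numerically. The following is a minimal sketch, not part of the original notes: the function names `is_sum_of_two_squares` and `criterion` are illustrative choices, and the check uses the equivalent formulation that no prime $p \equiv 3 \pmod 4$ divides $n$ to an odd power.

```python
from math import isqrt

def is_sum_of_two_squares(n: int) -> bool:
    """Brute force: is n = x^2 + y^2 for some integers x, y >= 0?"""
    for x in range(isqrt(n) + 1):
        y2 = n - x * x
        if isqrt(y2) ** 2 == y2:
            return True
    return False

def criterion(n: int) -> bool:
    """Theorem 3.5.6: the square-free part of n has no prime factor
    congruent to 3 (mod 4), i.e. no such prime divides n to an odd power."""
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3

# The brute-force search and the criterion agree on a small range.
print(all(is_sum_of_two_squares(n) == criterion(n) for n in range(1, 2000)))
```

Trial division is used only to keep the sketch self-contained; it is far too slow for large $n$.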


3.5.7. Example. As a second application, let us consider a Diophantine
equation:

\[ x^2 + 4 = z^3. \]

It is convenient to work in $\mathbb{Z}[\sqrt{-1}]$ rather than in the integers because
$x^2 + 4 = z^3$ factors to obtain

\[ z^3 = (x + 2i)(x - 2i). \]

First suppose that each of $x \pm 2i$ is a cube, so that there are integers $a$ and
$b$ with

\[ x + 2i = (a + bi)^3 = (a^3 - 3ab^2) + i(3a^2 b - b^3). \]

Hence,

\[ (3a^2 - b^2)b = 2. \]

Since $b$ divides 2, it must be $\pm 1$ or $\pm 2$. Checking each case provides the
solutions $a = \pm 1$, $b = 1$ or $-2$. Therefore $x = a^3 - 3ab^2 = \pm(1 - 3b^2) \in
\{\pm 2, \pm 11\}$. This yields the two positive solutions

\[ 2^2 + 4 = 2^3 \quad\text{and}\quad 11^2 + 4 = 5^3. \]


Let us show that these are the only solutions. Suppose that $(x, z)$ is a
positive solution. Let $p$ be a prime in $\mathbb{Z}[\sqrt{-1}]$ which divides $z$. If $p^m$ is the
greatest power of $p$ which divides $z$, then $p^{3m}$ divides $z^3$. If $p$ divides only
one of $x \pm 2i$, say $x + 2i$, then $p^{3m}$, which is a perfect cube, divides $x + 2i$.
However, $p$ might divide both $x \pm 2i$. In that case, it divides

\[ \gcd(x + 2i, x - 2i) = \gcd(x + 2i, 4). \]

Since $4 = -(1+i)^4$, this means $p = 1 + i$. Now $p$ is associated to $-i(1+i) =
1 - i = \bar{p}$. Thus if $p^k$ divides $x + 2i$, then $p^k$ divides $\overline{x + 2i} = x - 2i$. That
is, the multiplicities of $p$ as a factor of $x + 2i$ and of $x - 2i$ are equal. Thus, $3m$
is even, say $3m = 6n$. Hence, $x \pm 2i$ are both multiples of $p^{3n}$, which is also a
perfect cube. It follows that except for the factor of a unit, both $x \pm 2i$ are
perfect cubes. But the units, $\pm i$ and $\pm 1$, are all perfect cubes. So, $x \pm 2i$
are both cubes. Therefore we have found all the solutions.
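
As a sanity check on Example 3.5.7, one can search for small solutions directly. This is a sketch only; the function name `solutions` and the bound `z_max` are illustrative and do not appear in the notes.

```python
from math import isqrt

def solutions(z_max: int) -> list[tuple[int, int]]:
    """All positive integer pairs (x, z) with x^2 + 4 = z^3 and z <= z_max."""
    found = []
    for z in range(1, z_max + 1):
        x2 = z**3 - 4
        if x2 > 0 and isqrt(x2) ** 2 == x2:
            found.append((isqrt(x2), z))
    return found

print(solutions(10_000))  # -> [(2, 2), (11, 5)]
```

A finite search of course proves nothing beyond the bound; it only confirms that no solution was missed in that range.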


Exercises 


1. Factor 1105 completely in $\mathbb{Z}[\sqrt{-1}]$.
2. Solve the Diophantine equation x? + 44? = 2°.
3. Show that $\mathbb{Z}[\sqrt{-2}]$ has a division algorithm. Hence deduce that $\mathbb{Z}[\sqrt{-2}]$
has unique factorization.
4. Find all solutions of $x^2 + 2 = y^3$.
5. Give another argument to find all irreducible Pythagorean triples $(x, y, z)$,
i.e. $x^2 + y^2 = z^2$ and $\gcd(x, y) = 1$, as follows.
(a) Assume that $x$ is odd. Factor $x^2 + y^2 = (x + iy)(x - iy) = z^2$ in
$\mathbb{Z}[\sqrt{-1}]$. Prove that $x + iy$ is a square.
(b) Hence find a formula for $x$, $y$ and $z$.
(c) Verify that every triple of this form yields an irreducible Pythagorean
triple.


6. (Zagier) Let $p$ be a prime with $p \equiv 1 \pmod 4$. Define
\[ S = \{(x, y, z) \in \mathbb{N}^3 : x^2 + 4yz = p\}. \]
Also define $T : S \to S$ by

\[ T(x, y, z) = \begin{cases}
(x + 2z,\ z,\ y - x - z) & \text{if } x < y - z \\
(2y - x,\ y,\ x - y + z) & \text{if } y - z < x < 2y \\
(x - 2y,\ x - y + z,\ y) & \text{if } x > 2y.
\end{cases} \]

(a) Prove that $S$ is finite, $T(S) \subset S$ and $T \circ T = \mathrm{id}$.

(b) Prove that $T$ has a unique fixed point $(x_0, y_0, z_0) = T(x_0, y_0, z_0)$.
Deduce that $|S|$ is odd.
Hint: Note that a fixed point has the form $(x, x, z)$, which forces
$x \mid p$.

(c) Let $J(x, y, z) = (x, z, y)$. Show that $J(S) = S$. Using that $|S|$ is
odd, prove that $J$ has a fixed point.

(d) Deduce that $p$ is a sum of two squares.


7.* Find all solutions of $x^2 + 11 = y^3$. You must work in $\mathbb{Z}[\sqrt{-11}]$, which is
a Euclidean domain.


3.6. Quadratic Reciprocity 

Primitive roots can be used to analyze simple congruence equations. Recall
that $a$ is a primitive root modulo a prime $p$ if $\{a^k : 1 \le k \le p-1\}$ represents
all $p - 1$ distinct non-zero equivalence classes (mod $p$). Thus every $x \in \mathbb{Z}_p^*$
has the form $x \equiv a^k$ for some $k$. We can use this to solve certain congruence
equations.


3.6.1. Example. Consider the equation

\[ (\dagger) \qquad x^6 \equiv 13 \pmod{17}. \]

Of course, trial and error works for such a small number. However, let
us instead make use of the fact that 3 is a primitive root of $\mathbb{Z}_{17}$ (because
$3^8 \equiv -1 \pmod{17}$). A calculation shows that $3^4 \equiv 13 \pmod{17}$. Equation
$(\dagger)$ has a solution $x = 3^k$ if and only if

\[ 3^{6k} \equiv 3^4 \pmod{17}. \]

Hence $3^{6k-4} \equiv 1 \pmod{17}$. This occurs exactly when $6k \equiv 4 \pmod{16}$.
Since $\gcd(6, 16) = 2$ divides 4, equation $(\dagger)$ has the solutions $k \equiv 6 \pmod 8$
or $k \equiv 6, 14 \pmod{16}$. Thus the solutions are $x \equiv 3^6 \equiv 15 \pmod{17}$ and
$x \equiv 3^6 \cdot 3^8 \equiv 2 \pmod{17}$.

On the other hand, consider the equation

\[ x^6 \equiv 3 \pmod{17}. \]

Again if we set $x = 3^k$, the equation becomes

\[ 3^{6k} \equiv 3 = 3^1 \pmod{17}. \]

This has solutions $x = 3^k$ satisfying $6k \equiv 1 \pmod{16}$. This has no solutions
because $\gcd(6, 16) = 2$ does not divide 1.
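
The method of Example 3.6.1 can be sketched in a few lines. The function name `solve_power_congruence` is an illustrative assumption, and the table of discrete logarithms is built by brute force, which is only sensible for small primes.

```python
def solve_power_congruence(n: int, b: int, p: int, a: int) -> list[int]:
    """Solve x^n = b (mod p), given a primitive root a of the prime p."""
    # Discrete logarithm table: log_table[a^k mod p] = k for 1 <= k <= p-1.
    log_table = {pow(a, k, p): k for k in range(1, p)}
    m = log_table[b % p]
    # x = a^k is a solution exactly when n*k = m (mod p-1).
    exponents = [k for k in range(1, p) if (n * k - m) % (p - 1) == 0]
    return sorted(pow(a, k, p) for k in exponents)

print(solve_power_congruence(6, 13, 17, 3))  # -> [2, 15]
print(solve_power_congruence(6, 3, 17, 3))   # -> []
```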


The general result along these lines is proved in the same way. The 
added twist is that we obtain a condition that does not use primitive roots! 
However, the existence of primitive roots is used in the proof. 


3.6.2. Theorem. Let $p$ be a prime, let $n$ be a positive integer, and suppose
that $\gcd(b, p) = 1$. Set $s = \gcd(n, p-1)$ and $t = (p-1)/s$. Then the
congruence equation $x^n \equiv b \pmod p$ has solutions if and only if $b^t \equiv 1 \pmod p$. In
this case, there are $s$ distinct solutions modulo $p$.


Proof. Let $a$ be a primitive root mod $p$. Let $m$ be chosen so that $b \equiv
a^m \pmod p$. Then $x^n \equiv b \pmod p$ has a solution $x = a^k$ if and only if
$x^n = a^{nk} \equiv a^m$, which happens if and only if $nk \equiv m \pmod{p-1}$. By
Theorem 2.6.1, this has solutions exactly when $s = \gcd(n, p-1)$ divides $m$.
But $s \mid m$ if and only if $p - 1 \mid tm$. Since $a$ is a primitive root, $a^e \equiv 1 \pmod p$
exactly when $e$ is a multiple of $p - 1$. Thus our equation has a solution if
and only if

\[ 1 \equiv a^{tm} = (a^m)^t \equiv b^t \pmod p. \]

Moreover, the solution of $nk \equiv m \pmod{p-1}$ is unique modulo $t$, so that
there are exactly $s$ solutions modulo $p - 1$. Thus, when solutions exist, there
are exactly $s$ distinct solutions. ∎
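
Theorem 3.6.2 can be compared against a brute-force count for small primes. This is a sketch under the assumption that exhaustive search over $1 \le x < p$ is affordable; the name `check_theorem` is illustrative.

```python
from math import gcd

def check_theorem(p: int, n: int) -> bool:
    """Compare Theorem 3.6.2 with a brute-force count, for every b coprime to p."""
    s = gcd(n, p - 1)
    t = (p - 1) // s
    for b in range(1, p):
        count = sum(1 for x in range(1, p) if pow(x, n, p) == b)
        expected = s if pow(b, t, p) == 1 else 0
        if count != expected:
            return False
    return True

print(all(check_theorem(p, n) for p in (3, 5, 7, 11, 13, 17) for n in range(1, 20)))
```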


We apply this result for $n = 2$ and $p > 2$. Note that $s = \gcd(2, p-1) = 2$;
whence $t = \frac{p-1}{2}$.

3.6.3. Corollary. A number $b$ relatively prime to an odd prime $p$ is a square
modulo $p$ if and only if
\[ b^{(p-1)/2} \equiv 1 \pmod p. \]

3.6.4. Corollary. $x^2 \equiv -1 \pmod p$ has a solution if and only if $p = 2$
or $p \equiv 1 \pmod 4$.


Proof. For $p = 2$, $1^2 = 1 \equiv -1 \pmod 2$. For $p > 2$, write $p = 4n + e$ where
$e \in \{1, 3\}$. The previous corollary shows that $-1$ is a square modulo $p$ if
and only if $(-1)^{(p-1)/2} \equiv 1 \pmod p$. However

\[ (-1)^{(p-1)/2} = (-1)^{(e-1)/2} \equiv \begin{cases} 1 \pmod p & \text{if } e = 1 \\ -1 \pmod p & \text{if } e = 3. \end{cases} \]

That is, $-1$ is a square if and only if $p = 2$ or $p \equiv 1 \pmod 4$. ∎

Gauss was interested in the problem of deciding when a number $b$ was
congruent to a square modulo a prime $p$. He gave an elegant solution which
allows the calculations to be carried out easily by hand. The key result
became known as the Law of Quadratic Reciprocity. This was one of Gauss’s
most celebrated theorems.


3.6.5. Definition. The quadratic residue of $a$ modulo a prime $p$ is 1
if $a$ is a square modulo $p$, and $-1$ if it is not. It is denoted by $\left(\frac{a}{p}\right)$.


The corollary above shows that $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p$. Hence it
follows that

\[ \left(\frac{ab}{p}\right) \equiv (ab)^{(p-1)/2} = a^{(p-1)/2}\, b^{(p-1)/2} \equiv \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \pmod p. \]

In other words, the quadratic residue is multiplicative. So in order to do
computations, it suffices to know $\left(\frac{q}{p}\right)$ when $p$ and $q$ are primes. This is the
content of Gauss’s famous theorem, which we prove below.


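3.6.6. Law of Quadratic Reciprocity. Suppose that $p$ and $q$ are distinct odd
primes. Then

\[ \left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}} = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod 8 \\ -1 & \text{if } p \equiv \pm 3 \pmod 8 \end{cases} \]

and

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{(p-1)(q-1)}{4}} = \begin{cases} 1 & \text{if } p \equiv 1 \pmod 4 \text{ or } q \equiv 1 \pmod 4 \\ -1 & \text{if } p \equiv q \equiv 3 \pmod 4. \end{cases} \]

The quantity $\frac{p^2-1}{8}$ appears here. If $p = 8a \pm 1$, then $\frac{p^2-1}{8} = \frac{64a^2 \pm 16a}{8} = 8a^2 \pm 2a$ is
even; and if $p = 8a \pm 3$, then $\frac{p^2-1}{8} = \frac{64a^2 \pm 48a + 8}{8} = 8a^2 \pm 6a + 1$ is odd.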
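
Both statements of the Law of Quadratic Reciprocity can be spot-checked numerically using the congruence $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p$ from above. The helper name `legendre` and the list of test primes are illustrative choices, not part of the notes.

```python
def legendre(a: int, p: int) -> int:
    """Quadratic residue symbol (a/p) for an odd prime p not dividing a,
    computed from a^((p-1)/2) = +1 or -1 (mod p)."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

for p in odd_primes:
    # First statement: (2/p) = (-1)^((p^2 - 1)/8).
    assert legendre(2, p) == (-1) ** ((p * p - 1) // 8)
    for q in odd_primes:
        if p != q:
            # Second statement: (p/q)(q/p) = (-1)^((p-1)(q-1)/4).
            sign = (-1) ** ((p - 1) * (q - 1) // 4)
            assert legendre(p, q) * legendre(q, p) == sign

print("reciprocity verified for all pairs of primes below 50")
```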


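The following computational lemma will calculate $\left(\frac{a}{p}\right)$ in a different
way. The proof is tricky.

3.6.7. Lemma. Let $p$ be an odd prime and $a$ be relatively prime to $p$. Let
$0 < r_i < p$ be such that $ai \equiv r_i \pmod p$ for $1 \le i \le \frac{p-1}{2}$. Let

\[ n = \left|\left\{ i : 1 \le i \le \tfrac{p-1}{2} \text{ and } r_i > \tfrac{p}{2} \right\}\right| \quad\text{and}\quad N = \sum_{i=1}^{(p-1)/2} \left\lfloor \frac{ia}{p} \right\rfloor. \]

Then
\[ \left(\frac{a}{p}\right) = (-1)^n. \]
Furthermore, $N \equiv n + (a-1)\frac{p^2-1}{8} \pmod 2$. In particular, if $a$ is odd, then
\[ \left(\frac{a}{p}\right) = (-1)^N. \]


Proof. Let $b_1, \dots, b_m$ be the $r_i < \frac{p}{2}$ and let $c_1, \dots, c_n$ be the $r_i > \frac{p}{2}$. Then
$m + n = \frac{p-1}{2}$. Observe that if $1 \le i \le j \le \frac{p-1}{2}$, then

\[ r_i \pm r_j \equiv a(i \pm j) \equiv 0 \pmod p \iff i \pm j \equiv 0 \pmod p \iff i = j. \]

(With the plus sign the congruence never holds, since $2 \le i + j \le p - 1$.)
Therefore $b_1, \dots, b_m, p - c_1, \dots, p - c_n$ are distinct. Since $m + n = \frac{p-1}{2}$ and
the $b_i$ and $p - c_j$ all lie between 1 and $\frac{p-1}{2}$, we see that

\[ \{b_1, \dots, b_m, p - c_1, \dots, p - c_n\} = \{1, 2, \dots, \tfrac{p-1}{2}\}. \]

Thus,

\[ \left(\tfrac{p-1}{2}\right)! = \prod_{i=1}^{m} b_i \prod_{j=1}^{n} (p - c_j) \equiv (-1)^n \prod_{i=1}^{m} b_i \prod_{j=1}^{n} c_j
= (-1)^n \prod_{i=1}^{(p-1)/2} (ai) = (-1)^n a^{(p-1)/2} \left(\tfrac{p-1}{2}\right)! \pmod p \]

and hence

\[ \left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \equiv (-1)^n \pmod p. \]

We now turn to the computation of $N$. First observe that $ia = p\left\lfloor \frac{ia}{p} \right\rfloor + r_i$.
Thus, we have

\[ \sum_{i=1}^{(p-1)/2} r_i = a \sum_{i=1}^{(p-1)/2} i - pN = a\,\frac{p^2-1}{8} - pN. \]

On the other hand, working mod 2, we have

\[ \sum_{i=1}^{(p-1)/2} r_i = \sum_{i=1}^{m} b_i + \sum_{j=1}^{n} c_j \equiv \sum_{i=1}^{m} b_i + \sum_{j=1}^{n} \bigl((p - c_j) - p\bigr)
= -np + \sum_{i=1}^{(p-1)/2} i = -np + \frac{p^2-1}{8} \pmod 2. \]

Therefore, the two quantities just computed are equal modulo 2; whence

\[ N - n \equiv (N - n)p \equiv (a-1)\frac{p^2-1}{8} \pmod 2. \]

When $a$ is odd, $N \equiv n \pmod 2$; and so $\left(\frac{a}{p}\right) = (-1)^N$. ∎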
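
The two counts in Lemma 3.6.7 are easy to compute, so the lemma can be checked against $a^{(p-1)/2} \bmod p$ for small cases. This is only a numerical sketch; `gauss_counts` and `euler_criterion` are hypothetical helper names.

```python
def gauss_counts(a: int, p: int) -> tuple[int, int]:
    """Return (n, N) as in Lemma 3.6.7, for an odd prime p not dividing a."""
    half = (p - 1) // 2
    residues = [(a * i) % p for i in range(1, half + 1)]
    n = sum(1 for r_i in residues if 2 * r_i > p)        # number of r_i > p/2
    N = sum((a * i) // p for i in range(1, half + 1))     # sum of floor(ia/p)
    return n, N

def euler_criterion(a: int, p: int) -> int:
    """The value of (a/p) obtained from a^((p-1)/2) mod p."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

for p in (3, 5, 7, 11, 13, 17, 19, 23):
    for a in range(1, 100):
        if a % p == 0:
            continue
        n, N = gauss_counts(a, p)
        assert euler_criterion(a, p) == (-1) ** n
        assert (N - n - (a - 1) * (p * p - 1) // 8) % 2 == 0
        if a % 2 == 1:
            assert euler_criterion(a, p) == (-1) ** N

print("Lemma 3.6.7 verified for the tested cases")
```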


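3.6.8. Theorem. Let $p$ be an odd prime. Then

\[ \left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}} = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod 8 \\ -1 & \text{if } p \equiv \pm 3 \pmod 8. \end{cases} \]


Proof. By Lemma 3.6.7, we must count the number of elements $n$ in the set
$\{2, 4, 6, \dots, p-1\}$ which are greater than $\frac{p}{2}$. If $p \equiv 3 \pmod 4$, then the smallest
such even integer is $\frac{p+1}{2}$; and if $p \equiv 1 \pmod 4$, then the smallest such element
is $\frac{p+3}{2}$.

We first consider the case $p \equiv 1 \pmod 4$. Then

\[ n = \frac{(p-1) - \frac{p+3}{2}}{2} + 1 = \frac{p-1}{4} \equiv \begin{cases} 0 \pmod 2 & \text{if } p \equiv 1 \pmod 8 \\ 1 \pmod 2 & \text{if } p \equiv 5 \pmod 8. \end{cases} \]

Similarly, when $p \equiv 3 \pmod 4$,

\[ n = \frac{(p-1) - \frac{p+1}{2}}{2} + 1 = \frac{p+1}{4} \equiv \begin{cases} 1 \pmod 2 & \text{if } p \equiv 3 \pmod 8 \\ 0 \pmod 2 & \text{if } p \equiv 7 \pmod 8. \end{cases} \]

Therefore $\left(\frac{2}{p}\right) = 1$ if $p \equiv \pm 1 \pmod 8$ and $\left(\frac{2}{p}\right) = -1$ if $p \equiv \pm 3 \pmod 8$.
Thus, $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$. ∎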


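We are now ready to prove the Law of Quadratic Reciprocity.


Proof of Theorem 3.6.6. The first statement was established above in
Theorem 3.6.8. For the second statement, let

\[ N = \sum_{i=1}^{(p-1)/2} \left\lfloor \frac{iq}{p} \right\rfloor \quad\text{and}\quad M = \sum_{j=1}^{(q-1)/2} \left\lfloor \frac{jp}{q} \right\rfloor. \]

Since $p$ and $q$ are odd, Lemma 3.6.7 shows that

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{M+N}. \]

Consider the rectangle

\[ R = \left\{(x, y) \in \mathbb{Z}^2 : 1 \le x \le \tfrac{p-1}{2},\ 1 \le y \le \tfrac{q-1}{2}\right\}. \]

Notice that

\[ |R| = \frac{p-1}{2} \cdot \frac{q-1}{2} = \frac{(p-1)(q-1)}{4}. \]

By counting $|R|$ in a different way, we will see it is also equal to the quantity
$M + N$. Consider the line $L \subset \mathbb{R}^2$ defined by the equation $y = \frac{q}{p}x$. Since $p$
and $q$ are distinct primes, we see that $L \cap R = \emptyset$. Divide $R$ into two triangles

\[ T_1 = \left\{(x, y) \in R : y < \tfrac{q}{p}x\right\} \quad\text{and}\quad T_2 = \left\{(x, y) \in R : x < \tfrac{p}{q}y\right\}. \]

Then $|R| = |T_1| + |T_2|$. For each $1 \le i \le \frac{p-1}{2}$,

\[ \left|\left\{(i, y) : 1 \le y \le \tfrac{q-1}{2}\right\} \cap T_1\right| = \left|\left\{y \in \mathbb{Z} : 1 \le y < \tfrac{iq}{p}\right\}\right| = \left\lfloor \frac{iq}{p} \right\rfloor. \]

Hence

\[ |T_1| = \sum_{i=1}^{(p-1)/2} \left\lfloor \frac{iq}{p} \right\rfloor = N. \]

Similarly, for each $1 \le j \le \frac{q-1}{2}$,

\[ \left|\left\{(x, j) : 1 \le x \le \tfrac{p-1}{2}\right\} \cap T_2\right| = \left|\left\{x \in \mathbb{Z} : 1 \le x < \tfrac{jp}{q}\right\}\right| = \left\lfloor \frac{jp}{q} \right\rfloor. \]

Thus, $|T_2| = M$. Therefore $N + M = |R| = \frac{(p-1)(q-1)}{4}$ and so

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{M+N} = (-1)^{\frac{(p-1)(q-1)}{4}}. \]

This finishes the proof. ∎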
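
The lattice-point count at the heart of this proof can be checked directly: the two floor sums should add up to the number of points in the rectangle. The function name `floor_sums` is an illustrative assumption.

```python
def floor_sums(p: int, q: int) -> tuple[int, int]:
    """Return (N, M): the sums of floor(iq/p) and floor(jp/q) from the proof."""
    N = sum((i * q) // p for i in range(1, (p - 1) // 2 + 1))
    M = sum((j * p) // q for j in range(1, (q - 1) // 2 + 1))
    return N, M

for p, q in [(3, 5), (3, 7), (5, 7), (7, 11), (11, 13), (13, 17), (17, 61)]:
    N, M = floor_sums(p, q)
    # N + M should equal the number of lattice points in the rectangle R.
    assert N + M == (p - 1) * (q - 1) // 4

print(floor_sums(3, 5))  # -> (1, 1)
```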


Exercises 


1. Determine if 107 is a quadratic residue modulo 1009.
2. Determine if 20964 is a quadratic residue modulo 1987.
3. Find all solutions to the equation x° = 29 (mod 61).


4. Without using Theorem 3.6.6, show that for every prime $p$, at least one
of $-1$, 2 and $-2$ is a square modulo $p$.


5. Let $p$ be a prime and let $a, b \in \mathbb{N}$ with $p$ not dividing $a$ or $b$. Show that
exactly 1 or 3 of $a$, $b$, $ab$ are squares modulo $p$.

6. For a prime $p > 7$, show that there are always two consecutive quadratic
residues mod $p$ neither of which is zero.


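7. Let $p$ be a prime in $\mathbb{Z}$ and suppose 5 is not prime in $\mathbb{Z}[\sqrt{p}]$. Prove that
$p = 5$ or $p \equiv \pm 1 \pmod 5$.


8. (a) If $p$ is an odd prime, show that
\[ \left(\frac{3}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod{12} \\ -1 & \text{if } p \equiv \pm 5 \pmod{12}. \end{cases} \]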


(b) Find a similar formula for (°). 
Pp 
9. (a) Let $p$ be an odd prime. Consider the equation
\[ ax^2 + bx + c \equiv 0 \pmod p \]
where $p$ does not divide $a$. Let $d = b^2 - 4ac$ and $y = 2ax + b$. Reduce
this equation to $y^2 \equiv d \pmod p$ and hence obtain a quadratic
formula modulo $p$.

(b) Find a necessary and sufficient condition for this quadratic to have 
a root when p = 2. 


Notes on Chapter 3 


Diophantine equations are named after the Greek mathematician Diophan- 
tus of the 3rd century CE. Linear Diophantine equations were discussed in 
the notes in Chapter 2. 

Much earlier, in the 6th century BCE, Pythagoras gave examples of
integral Pythagorean triples and produced an infinite family. Later Plato 
found another non-trivial infinite family. Independently the Hindu scholars 
also found similar families. Around 300 BCE, Euclid gave more general 
families of solutions in his Elements. Many schools of mathematics around 
the world eventually solved this problem. 

Fermat studied ways of representing numbers as sums of squares, cubes, 
etc. He showed that a prime p = 1 (mod 4) is a sum of two squares in a 
unique way. He also knew that if n has a prime factor p = 3 (mod 4) to an 
odd power, then it is not a sum of two squares. The final form was due to 
Euler. 

Fermat wrote about his equation $x^n + y^n = z^n$ in his notes. However in
communications with others, he did not claim a solution. He did show the
impossibility of $x^4 + y^4 = z^2$. He may also have had a solution for $n = 3$,
since he challenged other mathematicians to solve it, although there is no
record of his solution. Euler solved $n = 3$. Legendre and Dirichlet solved
$n = 5$. The case $n = 7$ was due to Lamé, and was simplified by Lebesgue.
The first significant general theorem was due to Sophie Germain. Kummer
developed ideas of modern ring theory in order to analyze the failure of
unique factorization in various number domains. He used this to provide a
proof for all regular primes. The smallest cases remaining open after that
were 37 and 59.

Mordell made a conjecture in the 1920’s which, if true, would imply that 
equations like Fermat’s for n > 3 could have at most finitely many solutions. 
This was proved by Faltings in 1983, and he received the Fields medal for this 
work. By 1993, computers had been used to show that Fermat’s equation 
had no solutions for n < 4000000. In 1955, Shimura and Taniyama proposed 
a conjecture concerning elliptic curves and modular forms. It was later 
shown by work of Ribet that this conjecture implies the truth of Fermat’s 
claim. In 1993, Wiles announced a solution to a major case of this conjecture 
which implied Fermat’s last theorem. It turned out that there was a non- 
trivial gap which was later fixed by Wiles and Taylor. Breuil, Conrad, 
Diamond and Taylor proved the full Shimura—Taniyama Conjecture in 2001. 

Pell’s equation also has a long history back to antiquity. Bhaskara gave
a general method for solution in 1150. He explicitly solved $x^2 - 61y^2 = 1$,
giving the smallest solution $x = 1766319049$ and $y = 226153980$. Lagrange
proved that Bhaskara’s method worked in 1768. He later developed
a complete solution using continued fractions. Fermat found a solution for
$d < 150$ and challenged British mathematicians to solve the cases $d = 151$ or
$d = 313$. This was done by Brouncker. However Euler mistakenly attributed
this to Pell, and his name has stuck in spite of it being incorrect.

The quadratic reciprocity laws were conjectured by Euler and Legendre.
Legendre made substantial progress on the problem and introduced the
Legendre symbol $\left(\frac{a}{p}\right)$. Gauss published a complete solution in his treatise
Disquisitiones Arithmeticae in 1801.

See [9, Vol. II] for an extensive history of these problems, or consult the
books by Cooke [8] and Kleiner [20]. See the article [22] for more informa- 
tion about the work of Sophie Germain. See Ribenboim’s book Fermat’s last 
theorem for amateurs for more information on Fermat’s last theorem. 
See Stark for the solution to Pell’s equation using continued fractions. 
Hardy and Wright also contains much historic information, as well as 
the mathematics including a proof of the law of quadratic reciprocity. 


Chapter 4 


Codes and Factoring 


In this chapter, we will look at a code based on the number theoretic prop- 
erties that we have developed. Since this code depends on the fact that it 
is a lot easier to find big primes than to factor large numbers, we will also 
study how this is done. 


4.1. Codes 


Codes are a way of encrypting a message so that it is very difficult or 
impossible to read the message unless you have knowledge of the key which 
unlocks the message. The most familiar codes are simple substitution codes. 
This means that each letter is replaced with another one. For example, 
consider the permutation of the alphabet given by 


0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
5936071842KSRIUFHQPOWELJGYTADZMVXNBC


A message like ‘Houston airport, noon, Jan 22’ would become
‘QGMDZGJKPAYGAZJGGJOKJ33’.
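
A substitution code of this kind takes only a few lines to implement. The sketch below uses the permutation displayed above; the names `substitute` and `unsubstitute` are illustrative, and spaces and punctuation are simply dropped.

```python
PLAIN  = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "5936071842KSRIUFHQPOWELJGYTADZMVXNBC"

def substitute(message: str) -> str:
    """Encode by replacing each letter or digit; other characters are dropped."""
    kept = "".join(c for c in message.upper() if c in PLAIN)
    return kept.translate(str.maketrans(PLAIN, CIPHER))

def unsubstitute(coded: str) -> str:
    """Decode by applying the inverse permutation."""
    return coded.translate(str.maketrans(CIPHER, PLAIN))

coded = substitute("Houston airport, noon, Jan 22")
print(coded)                # -> QGMDZGJKPAYGAZJGGJOKJ33
print(unsubstitute(coded))  # -> HOUSTONAIRPORTNOONJAN22
```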


This kind of code is very easy to break with the aid of a computer. In fact, 
with a longer message, it can be done by hand and is a popular pastime for 
many people. For example, this message has 5 G’s, so one might think this 
is a vowel. 

Actually, computers routinely use codes all the time—not for secrecy, 
but because computers can only store numbers (base 2). The ASCII code 
provides a number from 0 to 255 for all digits, upper and lower case letters, 
and many other symbols. This is how the computer can store text, and 
how word processors can manipulate it. The modern UTF-8 system extends 
ASCII and encodes over 1,000,000 characters containing all major alphabets 
in the world. A character uses up to 4 bytes (32 bits), so there are $2^{32}$
possibilities. Since there is extra room in this system, certain bits are used
to detect and possibly correct errors. This is another important use of
encryption that helps ensure accurate transmission of digital data.

It is much more difficult to break the code known as a ‘one time pad’.
The idea is to code your message by using another message known to the 
encoder and the intended recipient. First we need a simple way to combine 
two letters into one. Let us use a 36 letter alphabet consisting of 26 letters 
and 10 decimal digits. We can think of each letter as representing an element 
in $\mathbb{Z}_{36}$. That is, the digits 0-9 represent themselves, and

A  B  C  D  E  F  G  H  I
10 11 12 13 14 15 16 17 18

J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35

Then two letters can be combined by addition modulo 36. Let us use the
message ‘The quick brown fox jumped over the lazy dog’ to encode our
message ‘Pizza 5:30 tonight’. Consider

PIZZA530TONIGHT
THEQUICKBROWNFO
IZDP4NFK4FBE3WH


To decode this, one needs to know the coding message. If this message is 
changed every time, for example using different pages of War and Peace 
each time, this is virtually impossible to break without stealing the code. 
However, if both the sender and the recipient always have their copy of War 
and Peace with them, it might be a giveaway. 
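
The one time pad over the 36 letter alphabet can be sketched as follows. The function names `pad_encode` and `pad_decode` are illustrative assumptions; decoding simply subtracts the pad modulo 36.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"   # position = value in Z_36

def to_values(text: str) -> list[int]:
    """Keep only letters and digits, and convert them to elements of Z_36."""
    return [ALPHABET.index(c) for c in text.upper() if c in ALPHABET]

def pad_encode(message: str, pad: str) -> str:
    """Add the pad to the message, letter by letter, modulo 36."""
    pairs = zip(to_values(message), to_values(pad))
    return "".join(ALPHABET[(m + k) % 36] for m, k in pairs)

def pad_decode(coded: str, pad: str) -> str:
    """Subtract the pad from the coded message, letter by letter, modulo 36."""
    pairs = zip(to_values(coded), to_values(pad))
    return "".join(ALPHABET[(c - k) % 36] for c, k in pairs)

pad = "The quick brown fox jumped over the lazy dog"
coded = pad_encode("Pizza 5:30 tonight", pad)
print(coded)                   # -> IZDP4NFK4FBE3WH
print(pad_decode(coded, pad))  # -> PIZZA530TONIGHT
```

The pad must be at least as long as the message; `zip` silently stops at the shorter of the two.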

One thing these two codes have in common with each other and most 
other codes is that encoding and decoding use the same information. An- 
other kind of code has been invented which is of quite a different character. 
Known as public key cryptography, the interesting thing about these codes 
is that the method for encryption can be made public. For example, the 
code can be published in the New York Times or be listed on an electronic 
bulletin board. Anyone can send you a coded message. The important point 
is that knowing how to encode does not tell you how to decode! 


4.2. The Rivest-Shamir-Adleman Scheme


The public key code that we will study was developed at MIT by Rivest,
Shamir and Adleman. The key point that makes their code secure is that
it is very easy to find large primes (say 200-300 decimal digits), but very
difficult to factor large numbers that have a small number of large prime
factors. The reason for this will be discussed in the next section.

Here is how it works. Pick two large primes $p$ and $q$, with 200-300
digits. Set $n = pq$, and notice that $\varphi(n) = (p-1)(q-1)$. Now pick another
number $r$ (say 6-10 digits) which is relatively prime to $\varphi(n)$. Publish the
two numbers $(n, r)$.

Anyone who wishes to send you a message does the following. First turn
your message into a number $M$ by some standard scheme such as ASCII
or any simple scheme that encodes the 36 characters 0-9 and A-Z as a two
digit number, possibly also including a-z and some punctuation marks. If
necessary, split your message into blocks so the numbers encoded are all less
than $n$. The coded message is

\[ C \equiv M^r \pmod n. \]

This message can now be published in the personals section of the New York
Times, or posted somewhere online.

The presumption is that all interested parties know the method of
encoding and the message sent. Nevertheless, it is secure! Only you can break
the code. To do this, you must know $p$ and $q$. And you must know the
Chinese Remainder Theorem. First, solve the equation

\[ rs \equiv 1 \pmod{\varphi(n)}. \]

This has a unique solution by Lemma 2.3.3. Of course, to find $s$ it is
necessary to know $\varphi(n)$, and to find it, one must factor $n$. The key is Euler’s
Theorem, which tells us that $M^{\varphi(n)} \equiv 1 \pmod n$ when $\gcd(M, n) = 1$. In
fact, since $n$ is the product of two distinct primes, it turns out that our
decoding method works for every $M$ in the interval $0 \le M < n$. Since $rs \equiv 1
\pmod{\varphi(n)}$, it can be written as $rs = 1 + k\varphi(n)$. Now compute $C^s \pmod n$
using the Chinese Remainder Theorem.

\[ C^s \equiv M^{rs} = M^1 (M^{p-1})^{k(q-1)} \equiv M \pmod p \]
\[ C^s \equiv M^{rs} = M^1 (M^{q-1})^{k(p-1)} \equiv M \pmod q \]

By the Chinese Remainder Theorem, $C^s \equiv M \pmod n$. Of course this only
finds $M$ up to a multiple of $n$. That is why we begin with a message such
that $0 \le M < n$.
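
The whole scheme fits in a few lines when the primes are tiny. The following is a toy sketch only: the primes 61 and 53 and the exponent 17 are illustrative choices, far too small to be secure, and the three-argument `pow` with exponent $-1$ (available in Python 3.8 and later) is used to solve $rs \equiv 1 \pmod{\varphi(n)}$.

```python
p, q = 61, 53                 # toy primes; real keys use hundreds of digits
n = p * q                     # 3233, published
phi = (p - 1) * (q - 1)       # 3120, kept secret
r = 17                        # published exponent, relatively prime to phi
s = pow(r, -1, phi)           # secret decoder: r * s = 1 (mod phi)

def encode(M: int) -> int:
    """Anyone can do this, knowing only (n, r)."""
    return pow(M, r, n)

def decode(C: int) -> int:
    """Only the holder of s can do this."""
    return pow(C, s, n)

# Decoding recovers every message 0 <= M < n, including multiples of p or q.
print(all(decode(encode(M)) == M for M in range(n)))
```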

If you have access to a symbolic manipulation software, try to design 
your own codes. Exchange messages with a friend, and decode them. Try 
using these same programs to break your friend’s code. 

The message is as secure as the difficulty of factoring large numbers 
(not practical) and the security of the storage location of the key s. (It 
is not necessary to remember p and q.) The latter consideration does not 
have anything to do with coding though. Of course, if everyone knows your 
encoding procedure, what prevents them from sending you a message and 
signing another name? How can you be sure that the message is really from 
your friend? The trick is for the sender to use his own code to give the 
message a signature. 

It works like this: suppose your code is $(n, r)$ and the sender has a
published code $(N, R)$. Let us also suppose that $N < n$. Only the sender
knows the decoding key $S$ for the $(N, R)$ code. The sender computes

\[ Q \equiv M^S \pmod N \quad\text{and}\quad 0 \le Q < N. \]

Then this is encoded by the $(n, r)$ code by

\[ C \equiv Q^r \pmod n. \]

Again $C$ is sent. To decode, you compute

\[ Q \equiv C^s \pmod n. \]

Fortunately, since $N < n$, we know that $0 \le Q < n$ without any ambiguity.
Now using the sender’s published code, compute

\[ M \equiv Q^R \pmod N. \]

This message must be from our friend because only he/she knows $S$ which
enabled the encoding in the first place.

What happens if $N > n$? Try it out on a computer. You will end up
with garbage. In this case you must encode using $n$ first:

\[ Q \equiv M^r \pmod n \]
\[ C \equiv Q^S \pmod N. \]

This is decoded in the same basic way.
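
The signature procedure for the case $N < n$ can be sketched with two toy codes. All numbers here are illustrative assumptions: the recipient uses $n = 61 \cdot 53$ and the sender uses $N = 47 \cdot 59$, both with public exponent 17.

```python
n, r, s = 3233, 17, 2753      # recipient's code: n = 61 * 53, s is secret
N, R, S = 2773, 17, 157       # sender's code:    N = 47 * 59 < n, S is secret

def sign_and_encode(M: int) -> int:
    """Sender: sign with the secret key S, then encode with the public (n, r)."""
    Q = pow(M, S, N)          # 0 <= Q < N < n
    return pow(Q, r, n)

def decode_and_verify(C: int) -> int:
    """Recipient: decode with the secret key s, then undo the signature with (N, R)."""
    Q = pow(C, s, n)
    return pow(Q, R, N)

# Every message 0 <= M < N survives the round trip.
print(all(decode_and_verify(sign_and_encode(M)) == M for M in range(N)))
```

Swapping the order of the two steps without regard to which modulus is larger loses information, which is the "garbage" described above.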

We end this section by discussing two practical aspects of the
Rivest-Shamir-Adleman scheme. First, the encryption scheme involves raising $M$
to a potentially large power $r$. If one computes $M^r$ by naively multiplying
$M$ with itself $r$ times, this requires $r$ computations, which is a large number.
Instead, the way one performs this computation in practice is to expand $r$ in
base 2, namely write

\[ r = a_0 + 2a_1 + 4a_2 + \cdots + 2^{k-1} a_{k-1} \]

where each $a_i \in \{0, 1\}$. Then, by repeatedly squaring, one computes $M$,
$M^2$, $M^4 = (M^2)^2$, $M^8 = (M^4)^2$, ..., $M^{2^{k-1}} = (M^{2^{k-2}})^2$ mod $n$. One then
computes the product

\[ M^r \equiv M^{a_0} (M^2)^{a_1} \cdots (M^{2^{k-1}})^{a_{k-1}} \pmod n. \]

In total, this requires very few computations. To obtain all powers $M^{2^j}$
for $j < k$ involves taking $k - 1$ products, and then obtaining $M^r$ involves
at most $a_0 + \cdots + a_{k-1} \le k$ products. Thus, this is on the order of $2k \approx 2\log_2(r)$
computations, which is substantially faster than performing $r$ computations.
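
Repeated squaring can be written as a short loop that reads the binary digits of $r$ from the lowest upward. The name `power_mod` is illustrative; Python's built-in three-argument `pow` computes the same value and is used here only as a check.

```python
def power_mod(M: int, r: int, n: int) -> int:
    """Compute M^r mod n by repeated squaring, for r >= 0 and n >= 1."""
    result = 1 % n
    square = M % n            # holds M^(2^j) mod n at step j
    while r > 0:
        if r & 1:             # binary digit a_j is 1: multiply this power in
            result = (result * square) % n
        square = (square * square) % n
        r >>= 1               # move on to the next binary digit
    return result

print(power_mod(3, 6, 17))    # -> 15
print(power_mod(2, 560, 561) == pow(2, 560, 561))
```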

Second, the Rivest-Shamir-Adleman scheme relies on choosing $n = pq$
with $p$ and $q$ large primes, which raises the question of how one obtains
large primes in practice. At the beginning of Section 1.4 we mentioned the
famous Prime Number Theorem, which asserts that for large $N$, there are
roughly $\frac{N}{\log(N)}$ prime numbers less than or equal to $N$. Said differently, if
we fix a large number $N$, the probability that a randomly chosen number
$m \in \{1, \dots, N\}$ is prime is roughly $\frac{1}{\log(N)}$. Therefore, if we choose $\log(N)$
numbers in $\{1, \dots, N\}$, there is a good chance that at least one of them is
prime. The chances are much higher if you avoid multiples of small primes.
In practice, one can test primality using the deterministic algorithm by
Agrawal-Kayal-Saxena developed in 2004 or the older Miller-Rabin
probabilistic test. Notice that even if $N$ is a large number with 500 digits, $\log(N)$
is only about 1000, so finding large primes with this method is quite practical
for a computer.


Exercises 


1. Show that $s$ works as a key to decode the RSA encoded message provided
that
\[ rs \equiv 1 \pmod{\operatorname{lcm}(p-1, q-1)}. \]


2. Use computer software to check that r = 42385687 and a number 2 lines 
long: 
n = 9187532068491850238012987000740627489892542940\ 
1183797214111268335816454459464037326759995364752417 


has a decoder 


s = 5697037877032797156343521223137628208530547872\ 
5834255953360930453245246857516891597701705638306003 


3. Use computer software to choose two primes p,q with 40-45 decimal 
digits, and construct an RSA code. 


4. Exchange messages with a friend. Code your student id number or a 
simple message with your code ‘signature’, then code it up using your 
friend’s code. 


5. Try to break your friend’s code. 


4.3. Primality Testing 


How do you tell if a large number is composite or prime without doing a lot
of trial divisions? It turns out that you may be able to show that a number
is composite without knowing any factors! In 2004, Agrawal-Kayal-Saxena
gave a groundbreaking efficient algorithm to determine if a number is prime.
Their algorithm uses the fact that if $n \ge 2$ and $\gcd(a, n) = 1$, then $n$ is prime
if and only if $(X + a)^n \equiv X^n + a^n \pmod n$; see Exercise 1. Checking this
particular congruence is not efficient. However they modify it in a way that
makes the problem tractable: if $(X + a)^n \equiv X^n + a^n \pmod n$ holds, then it
is also true that for all $r$, there are polynomials $f(X)$, $g(X)$ such that

\[ (4.3.1) \qquad (X + a)^n = (X^n + a^n) + (X^r - 1)g(X) + nf(X). \]

Indeed, we can simply take $g = 0$ and $f$ an appropriate polynomial.
Agrawal-Kayal-Saxena show a converse result: they prove that if there exist $r$ and a
set of $a$ such that (4.3.1) holds for some $f$ and $g$, then $n$ is a prime power.
Moreover, they make these choices in such a way that this equation can be
checked efficiently. Although the details of their algorithm are beyond the
scope of this course, in this section we highlight some other methods to test
primality.

One guaranteed way to test if a number $p$ is prime is based on the results
of Section 2.10. We showed in Theorem 2.10.6 that if $p$ is prime, then there
is a number $a$ such that the set of powers $\{a, a^2, a^3, \dots, a^{p-1}\}$ modulo $p$ is
a permutation of the list $\{1, 2, 3, \dots, p-1\}$. Conversely, the existence of
such an element guarantees that there are $p - 1$ different numbers relatively
prime to $p$. This means that $p$ is prime. Indeed, there are $\varphi(p-1)$ such
generators. So chances of finding one by trial and error are quite good. The
problem, however, with this test is that if $p$ is large, it is time-intensive to
compute all powers of a number $a$. In this section, we discuss more efficient
algorithms to test primality.

Like the Rivest-Shamir-Adleman code, the key to a more efficient
algorithm comes from Fermat’s Theorem. Let us suppose that a large number
$n$ is given. We know that if $n$ is prime, then $a^{n-1} \equiv 1 \pmod n$ for every $a$
not divisible by $n$. So if $a^{n-1} \not\equiv 1 \pmod n$ for some such $a$, then $n$ is definitely
composite. For example, if $n = 2096004487$, we can compute $2^{n-1} \equiv 1992692247 \pmod n$. Hence $n$
is composite. This does not tell much about how to factor it however.

There are some composite numbers which pass this test for all choices
of $a$ which are relatively prime to $n$. Such numbers are called Carmichael
numbers. They are much less common than primes. For example, Erdős
showed that the sum of their reciprocals converges. However, recent results
have shown that they are nevertheless quite plentiful. An example is $n =
561 = (3)(11)(17)$. Notice that if $\gcd(a, 561) = 1$,

\[ a^{560} = (a^2)^{280} \equiv 1 \pmod 3 \]
\[ a^{560} = (a^{10})^{56} \equiv 1 \pmod{11} \]
\[ a^{560} = (a^{16})^{35} \equiv 1 \pmod{17} \]

By the Chinese Remainder Theorem, one sees that $a^{560} \equiv 1 \pmod{561}$ for
all $a$ relatively prime to 561. In fact, $a^{80}$ would suffice.

Still, without any additional computation, it is possible to improve this
test. In our example, $560 = 16(35)$. Consider the computations

\[ 2^{35} \equiv 263 \pmod{561} \]
\[ 2^{70} = (2^{35})^2 \equiv 166 \pmod{561} \]
\[ 2^{140} = (2^{70})^2 \equiv 67 \pmod{561} \]
\[ 2^{280} = (2^{140})^2 \equiv 1 \pmod{561} \]
\[ 2^{560} = (2^{280})^2 \equiv 1 \pmod{561} \]

We see from this sequence of equations that 67 is a square root of 1 modulo
561. If 561 were a prime, there would be only two square roots, namely
$\pm 1$. So this shows conclusively that 561 is composite. In this case however,
information about the factors is revealed because

\[ 0 \equiv 67^2 - 1 = (67 - 1)(67 + 1) \pmod{561}. \]

So $\gcd(66, 561) = 33$ and $\gcd(68, 561) = 17$ are factors of 561.

The general procedure, known as the Miller-Rabin test, uses this
approach. Moreover, it does not involve any more computation than it requires
to get $a^{n-1} \pmod n$. Pull out all factors of 2 from $n - 1$, say $n - 1 = 2^k m$
with $m$ odd. Now compute $a^m \pmod n$, and then successively square it to compute
$a^{2m} \pmod n$, $a^{4m} \pmod n$, $a^{8m} \pmod n$, ..., $a^{n-1} \pmod n$. If 1 does not
occur in this list, then $n$ fails our earlier primality test. However, if $a^{n-1} \equiv 1
\pmod n$ and $a^m \not\equiv 1 \pmod n$, then there is a last congruence in this list
which is not 1. This will be a square root of 1 in $\mathbb{Z}_n$. If it is not $-1$, then $n$ is
definitely composite. This is because $x^2 = 1$ has only the solutions $x = \pm 1$
in a field, but can have more solutions when $n$ is composite.
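
The test just described can be sketched as follows. This is a minimal version for a single base $a$, with an illustrative function name `is_witness`; it returns True when the base $a$ proves that $n$ is composite, and False when $n$ passes the test for that base.

```python
def is_witness(n: int, a: int) -> bool:
    """Return True if base a proves that the odd number n > 2 is composite."""
    # Write n - 1 = 2^k * m with m odd.
    k, m = 0, n - 1
    while m % 2 == 0:
        k += 1
        m //= 2
    x = pow(a, m, n)
    if x == 1 or x == n - 1:
        return False          # the list starts at 1, or contains -1 at once
    for _ in range(k - 1):
        x = (x * x) % n       # next entry: a^(2m), a^(4m), ...
        if x == n - 1:
            return False      # the last entry different from 1 is -1
    # Either a^(n-1) is not 1, or a square root of 1 other than +-1 occurred.
    return True

print(pow(2, 560, 561) == 1)  # 561 passes the Fermat test for a = 2 ...
print(is_witness(561, 2))     # ... but is exposed as composite: True
print(any(is_witness(1009, a) for a in range(1, 1009)))  # 1009 is prime: False
```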

It is also easy to check whether n has any small prime factors. The 
computer can store the product P of all primes less than 1000. Compute 
gcd(n, P). If this is not 1, then n is composite. The composite numbers 
which pass the Miller-Rabin test for half a dozen random choices of a are 
quite rare. Indeed, a large number n which passes this test and has no 
small prime factors is almost surely prime. Such numbers have been called 
industrial grade primes. They are likely to be very hard to factor. 


Exercises 


1. Recall that $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is a positive integer. Prove that if $n \ge 2$, then
$n$ is prime if and only if $\binom{n}{k} \equiv 0 \pmod n$ for all $1 \le k \le n - 1$. This
result plays an important role in the Agrawal-Kayal-Saxena algorithm.


2. Show that 3053 is not prime by finding a congruence identity that con- 
tradicts primality. Do not factor it. 


3. Show that 3876721 is not prime by finding a congruence identity that 
contradicts primality. Do not factor it. 


4. Show that 1729 is a Carmichael number. Find a congruence identity that 
proves that n is not prime. (A factorization of n is not a satisfactory 
substitute. ) 


5. Show that 5755495201 is a Carmichael number. Find a congruence iden- 
tity which proves that n is not prime. (A factorization of n is not a 
satisfactory substitute.) You may use computer software. 


6. Show that if $p = 6k + 1$, $q = 12k + 1$, and $r = 18k + 1$ are all prime,
then $pqr$ is a Carmichael number.


7. Korselt showed that a composite integer n is a Carmichael number if and 
only if it is square free and for every prime p|n, one has (p — 1)|(n — 1). 
Prove that if n has this form, then it is a Carmichael number. 


4.4. Factoring Algorithms 


If you wish to factor a large number using a computer, there are various 
tricks you can try. No method known today can factor the product of two 
primes with 200-300 digits before the end of the universe. Nevertheless,
methods and computers will continue to improve. However, experts feel
that it will always be significantly easier to find large primes than to factor 
the product of two of them. So the security of our code is guaranteed if we 
make our primes stay ahead of the factoring game. 

However, most random numbers have small prime factors as well as large 
ones. Any sensible factoring algorithm starts by taking the gcd of n with 
the product of the first few primes. In this way, all factors less than, say 
1000, may be pulled out. Then test what is left to see if it is composite. If 
it seems prime, it almost surely is. So now you try to prove that it is prime. 
If it is composite, the hard work begins. Unfortunately, it is known that on 
average, numbers do not have very many factors (relative to their size). So 
most factors are very big. 

Most factoring schemes use quite sophisticated mathematics. Here is an 
elementary idea that goes back to Lagrange. The idea is simple: try to find 
non-trivial solutions of 

\[ x^2 \equiv y^2 \pmod{n}. \]


By a non-trivial solution, we mean one with $x \not\equiv \pm y \pmod{n}$. If n is composite, say
n = ab, then the solutions of

\[ x - y \equiv a \pmod{n}, \qquad x + y \equiv b \pmod{n} \]

provide non-trivial solutions. Conversely, if x and y form a non-trivial solution,
then $\gcd(x + y, n)$ and $\gcd(x - y, n)$ yield proper factors of n.

Lenstra and Pomerance have added some important new ideas to this
method. They hope that it will prove to be a better method than others
presently known. Their plan is to look for solutions of $x \equiv y \pmod{n}$ so
that x and y are both products of only small primes. If enough solutions
are found, they can be used to construct a solution of $x^2 \equiv y^2 \pmod{n}$. Let
us illustrate this with a small example.

Let n = 493. A few trials yield the following equivalences (mod 493). 


\begin{align*}
 -3 &\equiv 490 = 2\cdot 5\cdot 7^2 \\
 7 &\equiv 500 = 2^2\cdot 5^3 \\
 32 &\equiv 525 = 3\cdot 5^2\cdot 7 \\
 -7 &\equiv 486 = 2\cdot 3^5
\end{align*}


All of these equations contain only powers of —1, 2, 3, 5, and 7. Using 
only the total parity of the exponents, we can represent these equations by 
vectors. For example, the first equation has one — sign, and odd powers 
of 2, 3, and 5; but an even power of 7. This yields the vector (1,1,1,1,0). 
Altogether we obtain 


(1, 1, 1, 1, 0)
(0, 0, 0, 1, 1)
(0, 1, 1, 0, 1)
(1, 1, 1, 0, 1)

In order to get squares, we wish to combine them so that all the parities are 
even. Combining the first, second and fourth achieves this. Multiplying the 
three equations together yields

\[ 3\cdot 7^2 \equiv 2^4\cdot 3^5\cdot 5^4\cdot 7^2 \pmod{493}. \]

Since 3 and 7 are invertible modulo 493, cancellation yields $1 \equiv 2^4 3^4 5^4 \pmod{493}$.
This provides a solution to $x^2 \equiv y^2$ with x = 1 and $y = 2^2 3^2 5^2 = 900$.
Computing gcd(493, 901) = 17 and gcd(493, 899) = 29
provides a complete factorization.
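
The bookkeeping in this example can be checked mechanically. The following Python sketch is an illustration only: the helper names are ours, and it uses just the first, second and fourth congruences. Instead of cancelling common factors, it takes x to be the product of the left sides and y to be the integer square root of (left product) × (right product), which gives the same congruence of squares up to an invertible factor.

```python
from math import gcd, isqrt, prod

n = 493
PRIMES = (2, 3, 5, 7)

def exponent_vector(m):
    """Exponents of -1, 2, 3, 5, 7 in m (m must be a product of these)."""
    vec = [1 if m < 0 else 0]
    m = abs(m)
    for p in PRIMES:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        vec.append(e)
    if m != 1:
        raise ValueError("not a product of the chosen small primes")
    return vec

def parity_vector(lhs, rhs):
    """Parity of the total exponent of each of -1, 2, 3, 5, 7 in lhs = rhs (mod n)."""
    return tuple((a + b) % 2
                 for a, b in zip(exponent_vector(lhs), exponent_vector(rhs)))

# The first, second and fourth congruences of the example (mod 493).
relations = [(-3, 490), (7, 500), (-7, 486)]
assert all((l - r) % n == 0 for l, r in relations)

# Adding the three parity vectors mod 2 should give the zero vector.
combined = [sum(col) % 2
            for col in zip(*(parity_vector(l, r) for l, r in relations))]

L = prod(l for l, _ in relations)   # product of the left sides
R = prod(r for _, r in relations)   # product of the right sides
y = isqrt(L * R)                    # L * R is a perfect square
x = L                               # x^2 = L^2 = L * R = y^2 (mod n)
factors = sorted({gcd(x - y, n), gcd(x + y, n)})
```

Here `combined` is the zero vector, x = 147, y = 132300 = 147 · 900, and `factors` is the list [17, 29].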


Exercises 


1. Use computer software to find enough congruences to factor 1643 by the 
method described in this section. 


2. (Factoring algorithms and primality testing) Using computer soft- 
ware commands for the gcd and mod, but not a complete factoring 
command, interactively factor n = 21760197701640956578295160, and 
report on the steps as you go along. 

(i) Test for prime factors up to 1000 and factor them out. Let the large 
factor remaining be called m. 
(ii) Compute $3^{m-1} \pmod{m}$. What does this tell you?
(iii) One must use a brute force method to factor m. You may use that
999983 is a factor. Let the other factor be called q.
(iv) Repeat (i) for q − 1. This yields a prime factorization. Why?
(v) Prove that q is prime by finding a primitive root, say r.


Notes on Chapter 4 


The use of codes for the purpose of secure transfer of information has a 
long history. The primary uses were for military and political purposes, at 
least initially, as these parties had great resources. During World War II, 
codemaking and codebreaking were crucial parts of the war effort. This was 
the beginning of the use of calculating machines, and led to the computer 
revolution. The advent of computers has made the need for security in the 
transmission of messages something that is of importance to all of us. 
Computers also provided the means to use more sophisticated methods 
both for encryption and the breaking of these codes. A central issue was 
always how two parties could share information about a code that was safe 
from prying eyes. A major breakthrough was made by Diffie, Hellman and 
Merkle which allowed a public exchange between two parties to agree 
on a common key without revealing it to any eavesdropper. Diffie proposed 
that one could develop an asymmetrical code with a public key for encryption
that only the constructor could decode. This was accomplished by Rivest, 
Shamir and Adleman at MIT in 1978. It has since come out that the 
codebreaking division of GCHQ, the British signals intelligence agency, came 
up with a similar method to that of Diffie-Hellman-Merkle almost a decade 
earlier, but it was kept secret until recently. Since then, other methods have
been developed for public key codes. 

Simon Singh’s book is an interesting, non-technical introduction to 
codes and codebreaking. 

The use of codes to allow for accurate transmission over noisy signals 
goes back to work of Hamming in 1950. Nowadays, when large data files 
such as computer operating systems and other software are routinely down- 
loaded over the internet, the accuracy of transmission becomes as important 
as security. 

Computing the list of prime numbers goes back to ancient times. How- 
ever the testing of large integers to decide if they are prime, primality testing, 
is a modern idea relying on computers. The Miller-Rabin test dates 
from the mid-1970’s. The first definitive algorithm to test for primality is 
due to Agrawal, Kayal and Saxena [1]. Carmichael numbers were intro-
duced by Carmichael in 1910. There are infinitely many such numbers [3],
but the sum of their reciprocals is finite; so they are rare compared to
prime numbers. 

Factoring of composite numbers is considered to be very difficult, which 
is why the RSA scheme is thought to be secure. The modification of La-
grange’s ideas from Section 4.4 is due to Lenstra and Pomerance [23]. The
possibility of quantum computers and an algorithm of Peter Shor would 
make factoring practical, and would break the RSA code. Other algorithms 
for encryption that are secure against quantum computation have been de- 
veloped, but are not yet in widespread use. 


Chapter 5 


Real and Complex Numbers 


In this chapter, we will learn about the fields of real and complex numbers. 
In particular, we will prove the famous Fundamental Theorem of Algebra 
which asserts that every polynomial with complex coefficients factors into a 
product of linear terms. 


5.1. Real Numbers 


We learn in calculus that the rational numbers are not sufficient for the study 
of functions. For example, a nice function like $x^2 - 2$ does not have any zeros
if it is only defined on the rationals. Nor, from the point of view of algebra,
are the rationals adequate because this polynomial does not factor. The
‘natural’ domain of this function should include $\sqrt{2}$. Similarly, the function
$x^4 - 8x$ does not achieve its minimum value at any rational number. It also
turns out that the study of simple differential equations like $y' = y$ leads to
the solution $f(x) = e^x$ where e is an even stranger ‘number’. Similarly, the
integral

\[ \int_1^x \frac{1}{t}\, dt = \ln(x) \]

introduces another transcendental function, meaning a function that does not
satisfy an algebraic equation. Of course, you have already learned about the
trigonometric functions sin(x), cos(x), and so on which rely on the magic
number $\pi$. So for various reasons, we find that the rational numbers are
inadequate for the analysis of functions.

The answer is to allow these other ‘numbers’ which seem called for to 
fill the gaps between the rationals. There are a number of ways to define 
what these real numbers R should be. One of the simplest descriptions is 
to make use of the decimal system. We describe the set of real numbers as 
all possible infinite decimal expansions: 


\[ x = a_p a_{p-1} \ldots a_1 a_0 \,.\, a_{-1} a_{-2} a_{-3} \ldots \]


where the $a_i$ belong to the set {0,1,2,3,4,5,6,7,8,9}. Such expansions are
already familiar for rational numbers such as $\frac{1}{3} = 0.33333\ldots$ and
$\frac{22}{7} = 3.14285714\ldots$. Every such expansion gives us a real number. One problem
with this definition is that different decimal expansions may yield the same 
number. For example, 

1.000...=0.999.... 


This is a fairly minor problem, but you have to deal with this ambiguity of 
names whenever you talk about the operations of addition, multiplication, 
inverses, and even equality. A more important problem with this definition 
is that it assumes implicitly that all of these symbols represent a number 
and that we can define addition and multiplication. If we consider them 
as infinite series, then that helps define these operations as limits, but the 
whole notion of limits creates new issues. 

The discovery of the nature of the real numbers was intimately con- 
nected with the search for a good understanding of convergence and of the 
nature of sets. All these notions were formalized in the middle of the nine- 
teenth century. See Manheim for a history of topology. There were two 
different approaches. 

Bolzano and later Cauchy introduced the notion of a Cauchy sequence, 
which is the criterion used to decide if a sequence is convergent without the 
need for any mention of the limit point itself. So one can consider the set 
of all possible ‘limits’ of sequences of rational numbers. For example, the 
sequence $e_n = \sum_{k=0}^{n} \frac{1}{k!}$ can be shown to converge very rapidly to the real
number e. Even though e does not belong to the rational numbers, we can 
manipulate it in the same way by using these rational approximations. We 
are able to extend the notion of addition and multiplication because these 
operations are continuous. 
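
To see how quickly the rational approximations of e settle down, one can compute them exactly. The sketch below is an illustration only; the function names are ours, and it relies on Python's exact `Fraction` type so that every value is a genuine rational number.

```python
from fractions import Fraction
from math import factorial

def e_partial_sum(n):
    """The rational number e_n = 1/0! + 1/1! + ... + 1/n!."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

def cauchy_gap(m, n):
    """The distance |e_m - e_n|, computed exactly in the rationals."""
    return abs(e_partial_sum(m) - e_partial_sum(n))
```

For example, `e_partial_sum(3)` is 8/3, and `cauchy_gap(10, 20)` is already smaller than 1/10000000. The terms of the sequence crowd together without any reference to the limit e itself, which is exactly the Cauchy criterion.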

We have omitted an important part of the definition. For each real 
number, there are many different convergent sequences with it as a limit. 
So in reality, one must put an equivalence relation on the set of Cauchy 
sequences. Two sequences are equivalent if they have the same limit. But 
since all this must be done without reference to a limit point, say that two
Cauchy sequences $\{u_n\}$ and $\{v_n\}$ are equivalent if the sequence
$u_1, v_1, u_2, v_2, \ldots$ is also a Cauchy sequence. Alternatively, two Cauchy sequences are
equivalent if $\lim_{n\to\infty} (u_n - v_n) = 0$. The real numbers are the set of equivalence


classes of Cauchy sequences of rational numbers. Each rational number r 
corresponds to the class containing the constant sequence {r, r, r,...}. 

Another solution was proposed by Dedekind. He suggested a more al-
gebraic approach. Consider all non-empty proper subsets $A \subset \mathbb{Q}$ such that A has no
largest element and if $a \in A$ and $b < a$, then $b \in A$. These objects are called
Dedekind ‘cuts’, because they correspond to cutting the rational numbers
into two at some ‘real’ point. For example,

\[ A = \{ r \in \mathbb{Q} : r < 0 \text{ or } r^2 < 2 \} \]

represents $\sqrt{2}$. Addition can be defined on the sets themselves:

\[ A + B := \{ r + s : r \in A,\ s \in B \}. \]

Multiplication requires a little more ingenuity (try to define it), but is done
in a similar manner. One may then verify all the axioms of a field. 

We will not carry out the construction of the real numbers in this book. 
This discussion is for the purpose of making you aware that there were some 
significant problems involved in the definition of the real numbers that took 
many years to resolve. It took about 2000 years, from the Pythagorean 
school to the middle 1800’s, to realize that one needed an abstract, non- 
geometric, definition of real numbers. 

The real numbers have an important completeness property. This prop- 
erty was known before the real numbers were properly defined. Indeed, it 
was the realization that one needed to prove this completeness property that 
led to the more modern approach to mathematical proof. One way of stating 
this property is known as the: 


5.1.1 Least upper bound property. If X is a non-empty set of real
numbers with an upper bound, then there is a least upper bound s. That is,
every $a \in X$ satisfies $a \le s$; and if every $a \in X$ satisfies $a \le t$, then $s \le t$.


Let us look at how one can prove this using Dedekind’s definition. Each
$x \in X$ corresponds to a cut $A_x$. Define $S = \bigcup_{x \in X} A_x$. Let us now verify
that S is a cut which represents the least upper bound s of X. Note that S
is a proper subset of $\mathbb{Q}$ because it has an upper bound. If $a \in S$, then there
is some $x_0 \in X$ so that $a \in A_{x_0}$. Hence any $b < a$ belongs to $A_{x_0}$ and thus
to S; and there is some $c \in A_{x_0} \subset S$ with $a < c$. Thus S is a cut. Now S
is an upper bound for X because S contains $A_x$ for each $x \in X$, and thus
$x \le s$ for all $x \in X$. On the other hand, if $x \le t$ for all $x \in X$ and t is
represented by a cut T, then T must contain $A_x$ for every $x \in X$. Hence
$S \subset T$, and therefore $s \le t$.

The least upper bound property can be used to prove other basic prop-
erties of the real numbers. Examples include the Intermediate Value Theorem
and the fact that every Cauchy sequence of real numbers converges to a
real number. This latter property is known as completeness. Other well
known theorems such as the Heine-Borel Theorem and the Extreme Value 
Theorem depend crucially on this completeness property. We will require 
the Extreme Value Theorem in our proof of the Fundamental Theorem of 
Algebra. This is usually proven in a course on calculus or real analysis. 


Exercises 


1. Define multiplication using Dedekind’s definition of the real numbers. 
HINT: do it first for two positive numbers. 


2. Show that the associative law for addition holds in R using Dedekind 
cuts. 
3. Prove the Intermediate Value Theorem: If f is a continuous function on
[0,1] such that f(0) < 0 and f(1) > 0, then there is a real number s
such that f(s) = 0.

HINT: use the set {x : f(x) < 0} to help define a Dedekind cut.


4. Prove that every polynomial of odd degree with real coefficients has a
real root.


5. As discussed in this section, R is constructed from Q by taking limits
with respect to the absolute value. In this exercise we discuss another 
type of absolute value one may construct on Q that depends on a choice 
of prime p. One can also take limits of rational numbers with respect to 
this so-called p-adic norm and the result is a field known as the p-adic 
numbers. 


Let p be a prime. Let $|0|_p = 0$. For any $0 \ne a \in \mathbb{Z}$, let $|a|_p = p^{-k}$,
where $a = p^k u$ with $k \ge 0$ and $u \in \mathbb{Z}$ is relatively prime to p. For any
$0 \ne q \in \mathbb{Q}$, write $q = \frac{a}{b}$ with $a, b \in \mathbb{Z}$ non-zero and gcd(a,b) = 1. Then
let $|q|_p = |a|_p |b|_p^{-1}$.

(a) Prove that for all $q, r \in \mathbb{Q}$ we have $|qr|_p = |q|_p |r|_p$ and

\[ |q + r|_p \le \max\{|q|_p, |r|_p\} \le |q|_p + |r|_p. \]

Show that $|q + r|_p = \max\{|q|_p, |r|_p\}$ if $|q|_p \ne |r|_p$.

(b) Prove that the following series converges with respect to $|\cdot|_p$:

\[ \sum_{n=0}^{\infty} p^n = \frac{1}{1-p}, \]

i.e., show that $\lim_{N\to\infty} \left| 1 + (p-1) \sum_{n=0}^{N} p^n \right|_p = 0$.


5.2. Complex Numbers 


From the point of view of algebra, the real numbers still are deficient. One 
would like to be able to completely factor all polynomials. But a polynomial 
like $x^2 + 1$ has no real roots. The solution is to invent a root which we call
i. In other words, one constructs a larger number system which contains an
element i such that $i^2 = -1$. Nothing prevents us from introducing such a
symbol as long as we verify that our new system makes sense. 


Define the set of complex numbers C to be the collection of all ‘numbers’
of the form a + ib where a and b are real. It is often convenient to
associate the number a + ib with the vector (a, b) in the plane $\mathbb{R}^2$. Addition
is defined by vector addition:

\[ (a + ib) + (c + id) = (a + c) + i(b + d). \]

Multiplication is defined by extending real multiplication using the distribu-
tive law and the identity $i^2 = -1$. This forces the rule:

\[ (a + ib)(c + id) = ac + iad + ibc + i^2 bd = (ac - bd) + i(ad + bc). \]


5.2.1. Theorem. The complex numbers form a field. That is, addition is
commutative and associative, has the zero element 0 = 0 + i0, and additive
inverses −(a + ib) = (−a) + i(−b). Multiplication is commutative and as-
sociative and distributes over addition, has the identity element 1 = 1 + i0,
and non-zero elements have (multiplicative) inverses.


The proof of this theorem will not be written out in detail. A few 
comments will suffice here. The interested reader can carry out the rest of 
the argument. First, the properties of addition are valid because they are 
valid for vector addition. Commutativity of multiplication follows directly 
from the definition and commutativity of real multiplication. The associative 
law is a simple computation. Distributivity is also a routine computation. 
We will carry it out in detail to give the flavour of the proofs. 

Let u=a+ib, v=c+id and w=e+if be three complex numbers. 
We have to verify the identity (u+v)w = uw + vw. Compute: 


\begin{align*}
(u+v)w &= \big((a+c) + i(b+d)\big)(e + if) \\
 &= (ae + ce - bf - df) + i(af + cf + be + de) \\
 &= \big((ae - bf) + i(af + be)\big) + \big((ce - df) + i(cf + de)\big) \\
 &= uw + vw.
\end{align*}


The astute reader may notice that a special case of the distributive law is 
used in the proof. Multiplication by i does distribute over multiplication 
and addition of real numbers. This follows from the definition of complex 
addition and multiplication, and is not a circular proof. 

Multiplicative inverses are worth investigating more closely. First, define 
the complex conjugate of a complex number $z = x + iy$ by $\bar z = x - iy$.
Notice that $z \bar z = x^2 + y^2$ is a non-negative real number which is strictly
positive except when z = 0. So it follows that, for $z \ne 0$,

\[ z^{-1} = \frac{\bar z}{z \bar z} = \frac{x - iy}{x^2 + y^2} = \frac{x}{x^2 + y^2} - i\, \frac{y}{x^2 + y^2}. \]
This verifies all the properties of a field for C. 
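
The definitions above translate directly into code. The following Python sketch is an illustration only: the class name `Cx` is ours, and the coordinates are restricted to rational numbers (via `Fraction`) so that the field identities can be checked exactly rather than up to rounding.

```python
from fractions import Fraction

class Cx:
    """A complex number a + ib stored as the pair (a, b) of rationals."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"Cx({self.a}, {self.b})"

    def __add__(self, other):
        # Vector addition of the pairs.
        return Cx(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + ib)(c + id) = (ac - bd) + i(ad + bc)
        return Cx(self.a * other.a - self.b * other.b,
                  self.a * other.b + self.b * other.a)

    def conj(self):
        return Cx(self.a, -self.b)

    def inverse(self):
        # z^{-1} = conj(z) / (x^2 + y^2), defined only for z != 0.
        norm_sq = self.a ** 2 + self.b ** 2
        if norm_sq == 0:
            raise ZeroDivisionError("0 has no multiplicative inverse")
        return Cx(self.a / norm_sq, -self.b / norm_sq)
```

For instance, `Cx(0, 1) * Cx(0, 1) == Cx(-1)` confirms that i squared is −1, and `Cx(3, 4).inverse()` is 3/25 − (4/25)i.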

Let us collect together some simple properties of the conjugate function. 

All of these properties are left to the reader.


5.2.2. Proposition. Complex conjugation is an involution that preserves 
all the field operations: 

(1) Involution: $\overline{\bar z} = z$.

(2) Addition: $\overline{z_1 + z_2} = \bar z_1 + \bar z_2$.

(3) Multiplication: $\overline{z_1 z_2} = \bar z_1\, \bar z_2$.

There is an important geometric interpretation of the quantity
$z \bar z = x^2 + y^2$. This represents the square of the Euclidean length of the vector
(x, y) in the plane. So one introduces the notion of absolute value or
modulus for $z = x + iy$:


\[ |z| = (z \bar z)^{1/2} = \sqrt{x^2 + y^2}. \]


We also introduce the real and imaginary parts of z = x + iy defined by 


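\[ \operatorname{Re} z = x = \frac{z + \bar z}{2} \quad\text{and}\quad \operatorname{Im} z = y = \frac{z - \bar z}{2i}. \]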


The following proposition summarizes the basic properties of absolute value. 


5.2.3. Proposition. Let $z = x + yi$ and $w = u + vi$ be two complex
numbers. Then
(1) $|\bar z| = |z|$.
(2) $|zw| = |z|\,|w|$.
(3) $|z| \ge 0$. Moreover, $|z| = 0$ implies that $z = 0$.
(4) $|\operatorname{Re} z| \le |z|$ and $|\operatorname{Im} z| \le |z|$.
(5) (Triangle Inequality) $|w + z| \le |w| + |z|$.


Proof. The proofs of (1) and (3) are routine. For (2), notice that
\[ |zw|^2 = zw\, \overline{zw} = z \bar z\, w \bar w = |z|^2 |w|^2. \]
Property (4) is immediate from
\[ |z|^2 = (\operatorname{Re} z)^2 + (\operatorname{Im} z)^2. \]


Finally, the most important property is the triangle inequality. This 
name comes from the fact that the vectors w, z, and w+ z form the three 
sides of a triangle. The triangle inequality states that the length of one side 
is no longer than the sum of the lengths of the other two sides. 

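\begin{align*}
|w + z|^2 &= (w + z)(\bar w + \bar z) \\
 &= w \bar w + w \bar z + z \bar w + z \bar z \\
 &= |w|^2 + 2 \operatorname{Re}(w \bar z) + |z|^2 \\
 &\le |w|^2 + 2 |w \bar z| + |z|^2 \\
 &= |w|^2 + 2 |w|\,|z| + |z|^2 = (|w| + |z|)^2.
\end{align*}


Taking square roots establishes the inequality. □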


In Chapter 7 we introduce a general method for building a larger field
in which a given irreducible polynomial has a root. In this language, we
will see that starting with R and the polynomial $x^2 + 1$, we construct a field
isomorphic to C by adding a root i of $x^2 + 1$.


Exercises


1. Prove that $|w - z| \ge |w| - |z|$ for all complex numbers w and z.


2. Show that $\bar z$ and $z^{-1}$ lie on a straight line through 0.

Show that if $w \in \mathbb{C}$ is a root of a polynomial p(x) with real coefficients,
then $p(\bar w) = 0$ as well.


3. Prove that one cannot define an order < on the field of complex numbers 
(see properties [01] and [02] from Section [L.]). 


4. Show that there is no intermediate value theorem for polynomials with 
complex coefficients. 


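5. (Products of sums of two squares) Use complex numbers to prove
that if $a, b, c, d \in \mathbb{Z}$, then there exist $x, y \in \mathbb{Z}$ such that
$(a^2 + b^2)(c^2 + d^2) = x^2 + y^2$. (Compare with Lemma[.5.51)


6. Show that the set S of 2 × 2 matrices of the form
$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ with $a, b \in \mathbb{R}$ forms a field
under matrix addition and multiplication. Prove that the map

\[ \varphi : \mathbb{C} \to S, \qquad \varphi(a + ib) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}, \]

is an isomorphism of fields.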


7* If you are familiar with the properties of determinants, use the represen-
tation of the complex numbers in Exercise 6 to prove that $|wz| = |w|\,|z|$
by computing the determinants of the corresponding matrices.


5.3. Polar Form 


Every point in the plane can be described by its Cartesian coordinates (x, y).
It can also be described by its polar coordinates, $(r, \theta)$, where $r = (x^2 + y^2)^{1/2}$
is the length of the vector (x, y) and $\theta$ is the (oriented) angle in radians
between the positive real axis and the ray determined by positive multiples
of the vector (x, y). The Cartesian coordinates are determined from the
polar form by the equations


\[ x = r\cos(\theta) \quad\text{and}\quad y = r\sin(\theta). \]


Conversely, the polar coordinates are obtained from the Cartesian form by
solving these equations. Of course, the angle $\theta$ is determined only up to a
multiple of $2\pi$.

This can be applied to the complex plane via its identification with $\mathbb{R}^2$.
The argument of a complex number $z = x + iy$ is the angle $\operatorname{Arg}(z) = \theta$
in the polar form $(r, \theta)$ of the vector (x, y). Again, this argument is only
determined as a real number modulo $2\pi$. Of course, $r = |z|$. Let us introduce
a notation which will be used only for the next two sections:
\[ \operatorname{cis}(\theta) := \cos(\theta) + i \sin(\theta). \]


This complex number lies on the circle of radius 1, centre 0, known as the
unit circle. Conversely, every point on the unit circle has this form. So every
complex number can be represented as $z = r \operatorname{cis}(\theta)$. The significance of this
is that the argument of a product is the sum of the arguments.


5.3.1. Proposition. $r_1 \operatorname{cis}(\theta_1) \cdot r_2 \operatorname{cis}(\theta_2) = (r_1 r_2) \operatorname{cis}(\theta_1 + \theta_2)$.


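Proof. Calculate
\begin{align*}
\operatorname{cis}(\theta_1) \operatorname{cis}(\theta_2) &= (\cos\theta_1 + i \sin\theta_1)(\cos\theta_2 + i\sin\theta_2) \\
 &= (\cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2) + i(\cos\theta_1 \sin\theta_2 + \sin\theta_1 \cos\theta_2) \\
 &= \cos(\theta_1 + \theta_2) + i \sin(\theta_1 + \theta_2) = \operatorname{cis}(\theta_1 + \theta_2).
\end{align*}
The fact that the absolute values multiply is a consequence of Proposition
5.2.3 (2). □

An immediate consequence of this is known as de Moivre’s Theorem.

5.3.2. Corollary. $(\cos\theta + i\sin\theta)^n = \cos(n\theta) + i \sin(n\theta)$ for $n \ge 1$.

This formula is quite useful for calculations of certain sines and cosines.
For example, consider this identity for n = 5.
\begin{align*}
\cos(5\theta) + i\sin(5\theta) &= (\cos\theta + i\sin\theta)^5 \\
 &= (\cos^5\theta - 10\cos^3\theta \sin^2\theta + 5\cos\theta\sin^4\theta) \\
 &\quad + i(5\cos^4\theta\sin\theta - 10\cos^2\theta\sin^3\theta + \sin^5\theta).
\end{align*}
By using the identity $\cos^2\theta + \sin^2\theta = 1$, we obtain
\begin{align*}
\sin(5\theta) &= 5(1 - \sin^2\theta)^2 \sin\theta - 10(1 - \sin^2\theta)\sin^3\theta + \sin^5\theta \\
 &= 16\sin^5\theta - 20\sin^3\theta + 5\sin\theta.
\end{align*}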


In particular, apply this when $\theta = \pi/5$. Since $\sin(5\theta) = \sin(\pi) = 0$, $\sin(\pi/5)$ is a root of the
polynomial equation $16x^5 - 20x^3 + 5x = x(16(x^2)^2 - 20x^2 + 5) = 0$. Since
$\sin(\pi/5) \ne 0$, it follows that $\sin^2(\pi/5)$ is a root of the quadratic equation


\[ 16y^2 - 20y + 5 = 0. \]


This equation has roots $\frac{5 \pm \sqrt5}{8}$. To decide which root equals $\sin^2(\pi/5)$, notice
that $0 < \pi/5 < \pi/4$. Thus this angle lies in the first quadrant, on which
sin(x) is monotone increasing. So


\[ 0 < \sin(\pi/5) < \sin(\pi/4) = 1/\sqrt2. \]

Clearly, $\frac{5 - \sqrt5}{8} < \frac12 < \frac{5 + \sqrt5}{8}$. So,
\begin{align*}
\sin(\pi/5) &= \sqrt{\frac{5 - \sqrt5}{8}}, \\
\cos(\pi/5) &= \sqrt{1 - \sin^2(\pi/5)} = \sqrt{\frac{3 + \sqrt5}{8}} = \frac{1 + \sqrt5}{4}.
\end{align*}

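We can also solve other simple polynomial equations. For example,
consider
\[ z^4 + 4 = 0. \]
Writing $z = r \operatorname{cis}(\theta)$, the equation becomes
\[ r^4 \operatorname{cis}(4\theta) = -4 = 4 \operatorname{cis}(\pi). \]
Hence $r = \sqrt2$ and $\theta$ is an angle such that $4\theta \equiv \pi \pmod{2\pi}$. So
\[ \theta = \frac{\pi + 2\pi k}{4} = \frac{\pi}{4} + \frac{\pi}{2} k \]
for some integer k. Only the values k = 0, 1, 2, 3 are important, for after
that, the values repeat modulo $2\pi$. Hence the roots are


\begin{align*}
z_0 &= \sqrt2 \operatorname{cis}(\pi/4) = 1 + i \\
z_1 &= \sqrt2 \operatorname{cis}(3\pi/4) = -1 + i \\
z_2 &= \sqrt2 \operatorname{cis}(5\pi/4) = -1 - i \\
z_3 &= \sqrt2 \operatorname{cis}(7\pi/4) = 1 - i
\end{align*}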
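
The same recipe works for the n-th roots of any non-zero complex number. The following Python sketch is an illustration only; the function name `nth_roots` is ours, and it uses the standard library functions `cmath.polar` and `cmath.rect` to pass between Cartesian and polar form.

```python
import cmath
import math

def nth_roots(c, n):
    """All n solutions of z**n = c for a non-zero complex number c."""
    r, theta = cmath.polar(c)            # c = r cis(theta)
    root_r = r ** (1.0 / n)              # modulus of every root
    return [cmath.rect(root_r, (theta + 2 * math.pi * k) / n)
            for k in range(n)]

roots = nth_roots(-4, 4)
```

Up to rounding error, `roots` is the list 1 + i, −1 + i, −1 − i, 1 − i found above.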


Exercises 


1. Find the Cartesian form of all cube roots of 87. 
2. Find the exact values of $\sin(\pi/12)$ and $\cos(\pi/12)$ by using the identity
$\operatorname{cis}(\pi/3) \operatorname{cis}(-\pi/4) = \operatorname{cis}(\pi/12)$.


3. Find all complex roots of the polynomial $z^{10} + z^5 + 1 = 0$. Express at
least one of them in Cartesian form. 


4. Use de Moivre’s theorem to obtain a formula for $\cos 4\theta$ and $\sin 4\theta$.

5. Find all the 6th roots of −1. Graph them on the plane.

6. Calculate (1 + i)70?8.

7. Prove the quadratic formula for a quadratic with complex coefficients.
Deduce that every quadratic in C[z] has two complex roots.
HINT: complete the square.


8. (a) Solve $z^4 + 16 = 0$.
(b) Hence factor $p(x) = x^4 + 16$ as a product of two real quadratic
polynomials.


5.4. The Exponential Function


In this section, we will extend the definition of the exponential function to 
all complex numbers. To do this, we will search for a differentiable function 
$E : \mathbb{C} \to \mathbb{C}$ such that $E(w + z) = E(w)E(z)$ for all w and z in C and
$E(x) = e^x$ for all $x \in \mathbb{R}$. Once we have established the existence of this
function, we will write $e^z$ for E(z).

Let us calculate some simple properties that such a function must have. 
First, 


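\[ E(x + iy) = e^x E(iy). \]
And using the differentiability, we get
\begin{align*}
E'(z) &= \lim_{h \to 0} \frac{E(z+h) - E(z)}{h} \\
 &= \lim_{h \to 0} E(z)\, \frac{E(h) - 1}{h} \\
 &= E(z) \lim_{h \to 0} \frac{E(h) - E(0)}{h} = E(z) E'(0).
\end{align*}


Equality from line 2 to line 3 follows because we have assumed that the first
limit exists. Since $E(x) = e^x$ for real x, we have E'(0) = 1, and hence E'(z) = E(z).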

Now concentrate on the function f(y) = E(iy). Split it into its real 
and imaginary parts as f(y) = E(iy) = A(y) +iB(y). Differentiating with 
respect to y yields 


\begin{align*}
f'(y) &= A'(y) + iB'(y) \\
 &= E'(iy)\, \frac{d(iy)}{dy} \\
 &= iE(iy) = -B(y) + iA(y).
\end{align*}


So we arrive at the system of differential equations 


\begin{align*}
A'(y) &= -B(y) \\
B'(y) &= A(y).
\end{align*}


This leads to the second order differential equation $A''(y) = -A(y)$. From
the identity 1 = E(0) = A(0) + iB(0), we also get the initial conditions 
A(0) = 1 and A’(0) = —B(0) = 0. From calculus, we know that this system 
has a unique solution A(y) = cos(y) and B(y) = sin(y). 

Thus we arrive at a unique solution E(iy) = cos(y) + isin(y) = cis(y). 
So 


\[ E(x + iy) = e^x(\cos(y) + i\sin(y)) = e^x \operatorname{cis}(y). \]


For this reason, we will usually write $e^{i\theta} = \cos(\theta) + i\sin(\theta)$ instead of $\operatorname{cis}(\theta)$
from now on.

Let us verify that this function indeed has the properties that we searched
for.
\begin{align*}
E(x + iy)E(u + iv) &= e^x \operatorname{cis}(y)\, e^u \operatorname{cis}(v) \\
 &= e^{x+u} \operatorname{cis}(y + v) = E((x + iy) + (u + iv)).
\end{align*}
So E satisfies the multiplicative property.
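
The function E can be built from real functions alone and compared with a library implementation of the complex exponential. The Python sketch below is an illustration only; the names `cis` and `E` mirror the notation of this section, and `cmath.exp` is used purely as an independent check.

```python
import cmath
import math

def cis(theta):
    """cis(theta) = cos(theta) + i sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))

def E(z):
    """E(x + iy) = e^x cis(y), using only real exp, cos and sin."""
    z = complex(z)
    return math.exp(z.real) * cis(z.imag)

w, z = 0.3 + 2j, -1.2 + 0.7j
multiplicative_error = abs(E(w + z) - E(w) * E(z))

# Difference quotient (E(h) - 1) / h for a small complex h; it should be near 1.
h = 1e-6 * (1 + 1j)
quotient = (E(h) - 1) / h
```

Up to rounding, `multiplicative_error` is 0, `E(1j * math.pi)` is −1, and `quotient` is close to 1, in line with the claim E'(0) = 1 that is verified next.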


The derivative property is a bit more delicate. The hard part is to show 
that E'(0) = 1. For then, as above, we obtain
\[ E'(z) = E(z)E'(0) = E(z). \]
To verify that E'(0) = 1, we must show that
\[ \lim_{h \to 0} \frac{|E(h) - 1 - h|}{|h|} = 0. \]


The complication comes from the fact that h takes all small complex values, 
not just real values, as it approaches 0. However, we need only facts from 
the calculus of real functions to verify this limit. The major tool for making 
estimates is the mean value theorem. Let us write h = x + iy, so that 


\[ |h| = \sqrt{x^2 + y^2}. \]
We may assume that $|h| \le 1$, so in particular, $|x| \le 1$. Calculate
\begin{align*}
E(h) - 1 - h &= e^x \cos(y) + i e^x \sin(y) - 1 - x - iy \\
 &= e^x(\cos(y) - 1) + (e^x - 1 - x) + i e^x(\sin(y) - y) + iy(e^x - 1).
\end{align*}
Each of these terms can be estimated by the mean value theorem. First, 


since $f(y) = \cos(y)$ has derivative $f'(y) = -\sin(y)$, it follows that there is a
value c between 0 and y such that


\begin{align*}
|\cos(y) - 1| &= |f(y) - f(0)| \\
 &= |f'(c)|\,|y| = |-\sin(c)|\,|y| \le |c|\,|y| \le |y|^2.
\end{align*}
So $e^x |\cos(y) - 1| \le e|y|^2 \le e|h|^2$ provided that $|x| \le 1$.
A similar treatment of the function $e^x$ shows that $|e^x - 1| \le e|x|$ for
$|x| \le 1$. Now repeat the argument for the function $g(x) = e^x - 1 - x$, which
has derivative $g'(x) = e^x - 1$. Again by the mean value theorem, there is a
point c between 0 and x so that


\begin{align*}
|e^x - 1 - x| &= |g(x) - g(0)| = |x\, g'(c)| \\
 &= |x|\,|e^c - 1| \le e|x|^2 \le e|h|^2.
\end{align*}


A third application with the function $k(y) = \sin(y) - y$ and derivative
$k'(y) = \cos(y) - 1$ yields a point c between 0 and y so that


\[ |\sin(y) - y| = |y|\,|\cos(c) - 1| \le |y|\,|c|^2 \le |y|^3. \]
Together with the inequality $|e^x| \le e$ for $|x| \le 1$, this yields
\[ |i e^x(\sin(y) - y)| \le e|y|^3 \le e|h|^3. \]
Finally, the fourth term is handled by $2|xy| \le x^2 + y^2 = |h|^2$, so
\[ |y(e^x - 1)| \le e|y|\,|x| \le 2|h|^2. \]
Putting it all together yields, for $|h| \le 1$,
\[ |E(h) - 1 - h| \le e|h|^2 + e|h|^2 + e|h|^3 + 2|h|^2 = (2e + 2 + e|h|)|h|^2. \]
Thus
\[ \lim_{h \to 0} \frac{|E(h) - 1 - h|}{|h|} \le \lim_{h \to 0} (2e + 2 + e|h|)|h| = 0, \]
and therefore E'(0) = 1.


Exercises 


1. (a) Graph the image of a line parallel to the y-axis under the exponential
map.

(b) Graph the image of a line parallel to the x-axis under the exponential
map.

(c) Show that the strip $\{z = x + iy : 0 \le y < 2\pi\}$ is mapped by the
exponential function one to one and onto the complex plane with the origin removed.


2. (Sum Angle Formula for sin and cos) Use the formulae
$\cos(z) = (e^{iz} + e^{-iz})/2$ and $\sin(z) = (e^{iz} - e^{-iz})/2i$.
(a) Prove that sin(w + z) = sin(w) cos(z) + cos(w) sin(z).
(b) Prove that cos(w + z) = cos(w) cos(z) − sin(w) sin(z).


3. Find all solutions of sin(z) = 2.


4. Let $f(z) = w_1 \sin(z) + w_2 \cos(z)$, where $w_1, w_2 \in \mathbb{C}$. Compute $f''(z)$.


5.5. Fundamental Theorem of Algebra 


In this section, we will prove the famous Fundamental Theorem of Algebra 
that states that every polynomial with complex coefficients factors into a 
product of linear terms. It is not easy to prove this theorem in a strictly 
algebraic way. Indeed, one can argue that it is really the analytic properties 
of polynomials that make this result transparent. There are several acces- 
sible proofs. They all rely on some property of functions that depends on 
the completeness properties of the real and complex numbers. This proof 
depends on the Extreme Value Theorem: a continuous real valued function
on a closed bounded subset of the plane achieves its maximum value at some
point. If you have only seen this for functions on an interval, see Exercise 4.
We begin with a preliminary lemma. 


5.5.1. Lemma. Let F be a field and assume that every non-constant polynomial with
coefficients in F has a root in F. Then every polynomial with coefficients in
F of degree $d \ge 1$ factors into a product of d linear terms.

Proof. We will use induction on the degree d of a polynomial with coefficients
in F. For d = 1, the result is clear. Now for d > 1, if p is a polynomial of
degree d, by hypothesis it has a root $r \in F$. So, p(z) = (z − r)q(z) with q
a degree d − 1 polynomial. By the induction hypothesis, q(z) factors as a
product of d − 1 linear factors. Hence p(z) factors as the product of d linear
factors. □


5.5.2 Fundamental Theorem of Algebra. Every polynomial with
complex coefficients of degree $d \ge 1$ factors into a product of d linear terms.


Proof. Let $p(z) = \sum_{j=0}^{d} a_j z^j$ be a polynomial of degree $d \ge 1$; so that
$a_d \ne 0$. By Lemma 5.5.1, it is enough to show p has a complex root.

Assume, to the contrary, that p(z) is never 0. In particular $a_0 = p(0) \ne 0$.
The proof will be divided into 3 main steps: 


(1) Find a global minimum for |p|.

(2) Normalize p to obtain a polynomial q with $\min |q(z)| = 1 = q(0)$.
Then we may write $q(z) = 1 + q_0(z)$.

(3) Show $q_0(z)$ achieves a small negative value, contradicting the fact
that the minimum of |q| is 1.


A key point to observe here is that Steps 1 and 2 work over R, so it is only 
in Step 3 where we make use of C. 


Step 1. Notice that 


\[ \lim_{|z| \to \infty} |p(z)| = \lim_{|z| \to \infty} |z|^d \left| a_d + \frac{a_{d-1}}{z} + \frac{a_{d-2}}{z^2} + \cdots + \frac{a_0}{z^d} \right| = \infty \]


since the second factor tends to the finite non-zero limit $|a_d|$ and $|z|^d$ tends
to infinity. Thus there is a large real number R so that $|p(z)| > |a_0|$ for all
$|z| \ge R$.

By the Extreme Value Theorem applied to the continuous real valued
function $f(z) = -|p(z)|$ on the closed bounded set $\{z \in \mathbb{C} : |z| \le R\}$, there
is a point $z_0$ so that


\[ |p(z_0)| \le |p(z)| \quad\text{for all } z \in \mathbb{C},\ |z| \le R. \]


But for $|z| > R$, one has $|p(z)| > |a_0| = |p(0)| \ge |p(z_0)|$. So |p(z)| achieves
its global minimum at $z_0$.


Step 2. To simplify the computations, replace $p(z)$ by the polynomial

\[ q(z) = \frac{p(z + z_0)}{p(z_0)}. \]

Notice that $q(z)$ is also a polynomial of degree $d$ which is never 0, and $|q|$
takes its minimum value 1 at $z = 0$. That is,

\[ 1 = q(0) \le |q(z)| \quad \text{for all } z \in \mathbb{C}. \]

The constant term of $q$ is 1. Let $b$ be the next non-zero coefficient, say the
coefficient of $z^k$ with $k \ge 1$, so that

\[ q(z) = 1 + bz^k + \text{higher order terms} = 1 + bz^k r(z) \]

where $r$ is another polynomial such that $r(0) = 1$.
Step 3. Since $r(z)$ is continuous, there is a positive real number $\varepsilon$ so that

\[ |r(z) - 1| < \tfrac{1}{2} \quad \text{for } |z| \le \varepsilon. \]

Choose an angle $\theta$ so that $b e^{ik\theta}$ is a negative real number. Indeed, one can
take $\theta = (\pi - \operatorname{Arg}(b))/k$. Set $w = \varepsilon e^{i\theta}$, and note that because of the choice
of $\theta$, one has $bw^k = -|b|\varepsilon^k$. By replacing $\varepsilon$ by an even smaller positive
number if necessary, we can also suppose that $|bw^k| < 1$. Let us also write
$r(w) = 1 + u$, where $|u| < \tfrac{1}{2}$. Therefore,

\[ q(w) = 1 + bw^k r(w) = 1 - |b|\varepsilon^k (1 + u) = (1 - |b|\varepsilon^k) - |b|\varepsilon^k u. \]

Hence

\[ |q(w)| \le 1 - |b|\varepsilon^k + \frac{|b|\varepsilon^k}{2} = 1 - \frac{|b|\varepsilon^k}{2} < 1. \]

This contradicts the fact that $q$ has minimum modulus 1. So the assumption
that $q$ and $p$ have no roots is false. Hence $p$ has a root. Therefore
by Lemma 5.5.1 the proof is complete. ∎
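
To see the mechanism of Step 3 in a concrete case, here is a small worked computation. The polynomial $q(z) = 1 + iz^2$ is an illustrative choice made for this sketch and does not come from the text; it has $b = i$, $k = 2$ and $r(z) = 1$, so $u = 0$.

```latex
\[
\operatorname{Arg}(b) = \tfrac{\pi}{2}, \qquad
\theta = \frac{\pi - \pi/2}{2} = \frac{\pi}{4}, \qquad
w = \varepsilon e^{i\pi/4},
\]
\[
b w^2 = i \cdot \varepsilon^2 e^{i\pi/2} = i \cdot i\,\varepsilon^2 = -\varepsilon^2,
\qquad
q(w) = 1 - \varepsilon^2 < 1 = q(0) \quad \text{for } 0 < \varepsilon < 1 .
\]
```

So moving away from 0 in the direction $e^{i\pi/4}$ lowers $|q|$ below 1, which is exactly the behaviour the proof exploits: the value $|q(0)| = 1$ cannot be a minimum of $|q|$.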


Exercises 


1. Let $f(x) = x^d + a_{d-1}x^{d-1} + \cdots + a_0$ with $a_i \in \mathbb{C}$. Prove that every root
$\alpha$ of $f$ satisfies

\[ |\alpha| \le \max\Big\{ 1, \sum_{j=0}^{d-1} |a_j| \Big\}. \]


2. (Cauchy’s bound) Let $p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$ be a monic
polynomial, and let $r$ be a root. Prove that $|r| \le 1 + A$ where $A =
\max\{|a_i| : 0 \le i < n\}$.

Hint: if $|r| > 1$, use $r^n = -\sum_{j=0}^{n-1} a_j r^j$ to bound $|r|^n$.


3. (Partial fraction decomposition) Let $f$ and $g$ be polynomials with
complex coefficients. We may write $g(x) = \prod_{i=1}^{m} (x - r_i)^{n_i}$ with $r_1, \dots,
r_m \in \mathbb{C}$ distinct. Prove that there is a polynomial $h(x)$ and constants
$a_{ij} \in \mathbb{C}$ such that

\[ \frac{f(x)}{g(x)} = h(x) + \sum_{i=1}^{m} \sum_{j=1}^{n_i} \frac{a_{ij}}{(x - r_i)^j}. \]


4* (EVT for a rectangle) Assume the Extreme Value Theorem for a continuous
function on an interval $[a,b] \subset \mathbb{R}$, and prove it for a continuous
function $f(x,y)$ on a rectangle $[a,b] \times [c,d]$.
Hint: for $x \in [a,b]$, let $f_x(y) = f(x,y)$ be a function on $[c,d]$. Find $y(x)$
so that $f_x$ attains its maximum at $y(x)$. Let $g(x) = \max f_x$. Show that
$g$ is continuous on $[a,b]$.


5* Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$ with $a_i \in \mathbb{Z}$ and suppose that

\[ |a_{n-1}| > 1 + |a_{n-2}| + \cdots + |a_1| + |a_0|. \]

Let $z_1, \dots, z_n \in \mathbb{C}$ be the roots of $f$. Prove there is a unique $i$ such that
$|z_i| > 1$, and $|z_j| < 1$ for all $j \ne i$.


6* (Gershgorin Disc Theorem) Let $A = (a_{ij})_{1 \le i,j \le n}$ be an $n \times n$ matrix
with the $a_{ij} \in \mathbb{C}$. For $1 \le i \le n$, let $R_i = \sum_{j \ne i} |a_{ij}|$. Let $\lambda \in \mathbb{C}$ be
an eigenvalue of $A$, i.e., there exists an $n \times 1$ matrix $v \ne 0$ such that
$Av = \lambda v$. Let

\[ D(a_{ii}, R_i) = \{ z \in \mathbb{C} : |z - a_{ii}| \le R_i \}, \]

known as a Gershgorin disc. Prove that $\lambda$ lies in a Gershgorin disc.
Hint: pick $i_0$ so that $|v_{i_0}| = \max\{|v_i| : 1 \le i \le n\}$ and look at the $i_0$-th
coefficient of $Av - \lambda v$.


5.6. Real Polynomials 


The theory for real polynomials is not quite as simple as for complex polyno- 
mials because certain real polynomials do not factor into real linear factors. 
However, we may use the Fundamental Theorem of Algebra to figure out 
what happens. 


5.6.1. Lemma. Let $p(x)$ be a polynomial with real coefficients. If $a$
is a complex root of $p$, then $\overline{a}$ is also a root.


Proof. Let $p(z) = \sum_{j=0}^{d} p_j z^j$. This is immediate from the observation

\[ p(\overline{a}) = \sum_{j=0}^{d} p_j \overline{a}^{\,j} = \overline{\sum_{j=0}^{d} p_j a^j} = \overline{p(a)} = 0. \]
∎


5.6.2. Theorem. Every real polynomial factors into a product of linear 
and quadratic factors, in which the quadratic factors have no real roots. 


Proof. By the Fundamental Theorem of Algebra, the polynomial $p$ can be
factored into linear complex terms. By factoring out the leading coefficient,

\[ p(x) = c(x - a_1)(x - a_2) \cdots (x - a_d). \]

(Footnote to Exercise 5 of Section 5.5: This is due to Panaitopol. See https://yufeizhao.com/olympiad/intpoly.pdf.)

Now $c$ is real. Whenever $a_i$ is real, $x - a_i$ is a factor of $p$ over the real
field. When $a_i$ is not real, the lemma shows that there is an integer $j$ so
that $a_j = \overline{a_i}$. In this case, write $a_i = u + iv$ and $a_j = u - iv$. Then

\[ (x - a_i)(x - a_j) = x^2 - 2ux + (u^2 + v^2). \]


This is a real polynomial. 

It remains to show that all the roots come in pairs. This is seen by
induction on the degree of $p$. Indeed, this is true for degree $d = 1$. If the
result holds for all real polynomials of lower degree, consider the case for
$p$ above. If $p$ has a real root $a_1$, then $p$ factors as $p(x) = (x - a_1)p_1(x)$.
Moreover, it is clear that division of $p$ by $x - a_1$ uses only real coefficients.
So $p_1$ is real. Similarly, if $p$ has a pair of non-real roots $a_1 = u + iv$ and
$a_2 = u - iv$, then $p$ factors as $p(x) = (x^2 - 2ux + u^2 + v^2)p_1(x)$. Again,
division of a real polynomial by another leaves a real quotient. In either
case, the induction hypothesis applies to $p_1(x)$ and it factors as a product
of linear terms and quadratic terms with non-real roots. Hence the result
follows for $p$ as well. ∎
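
As a small illustration of the pairing argument, consider the polynomial $x^4 + 4$, an example chosen here rather than taken from the text. Its four complex roots are $\pm 1 \pm i$, and grouping each root with its conjugate gives two real quadratics:

```latex
\[
(x - (1+i))(x - (1-i)) = x^2 - 2x + 2, \qquad
(x - (-1+i))(x - (-1-i)) = x^2 + 2x + 2,
\]
\[
(x^2 - 2x + 2)(x^2 + 2x + 2) = (x^2 + 2)^2 - (2x)^2 = x^4 + 4 .
\]
```

Both quadratic factors have discriminant $4 - 8 = -4 < 0$, so neither has a real root, in agreement with Theorem 5.6.2.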


We get the following immediate corollary about real polynomials of odd 
degree because at least one of the factors must be of odd degree (hence 
degree 1). This is also an immediate consequence of the Intermediate Value 
Theorem. 


5.6.3. Corollary. A real polynomial of odd degree has at least one real 
root. 


Somehow we have managed all this discussion of factorization without 
any discussion of uniqueness. Of course, this is a crucial issue. Because 
the factorization over the real or complex numbers is intimately connected 
with roots, this question can be handled here by special ad hoc arguments. 
However, we will see in the next chapter that the polynomials over any 
field always have unique factorization into ‘primes’, known as irreducible 
polynomials. 


Exercises 


1. Show that a quadratic $p(x) = ax^2 + bx + c$ with real coefficients has two
(possibly equal) real roots if and only if the discriminant $\Delta(p) = b^2 - 4ac$
is non-negative.


2. (Partial fraction decompositions, again) Let $f$ and $g$ be polynomials
with real coefficients. Write $g(x) = \prod_{i=1}^{M} (x - r_i)^{m_i} \prod_{j=1}^{N} q_j(x)^{n_j}$
with $r_1, \dots, r_M \in \mathbb{R}$ distinct and the $q_j(x)$ quadratic polynomials with no
real roots. Prove that there is a polynomial $h(x)$, constants $a_{ik} \in \mathbb{R}$,
and linear polynomials $\ell_{jk}(x) \in \mathbb{R}[x]$ so that

\[ \frac{f(x)}{g(x)} = h(x) + \sum_{i=1}^{M} \sum_{k=1}^{m_i} \frac{a_{ik}}{(x - r_i)^k} + \sum_{j=1}^{N} \sum_{k=1}^{n_j} \frac{\ell_{jk}(x)}{q_j(x)^k}. \]


3. (Descartes’s Rule of Signs) Let $p(x)$ be a polynomial with real coefficients.
Write $p(x) = a_{i_1}x^{i_1} + \cdots + a_{i_m}x^{i_m}$ where $i_1 > i_2 > \cdots > i_m \ge 0$
and all $a_{i_j} \ne 0$. Let $s$ be the number of sign changes in the coefficients
of $p$, i.e., $s$ is the total number of indices $j$ for which $a_{i_j}a_{i_{j+1}} < 0$. Let $t$ be
the number of positive roots of $p$, counting multiplicity (e.g., a factor of
$(x - r)^k$ counts as $k$ roots). In this exercise, we prove that $t \le s$ and
$s - t$ is even.


(a) Reduce to the case in which $a_{i_1} = 1$ and $i_m = 0$.

(b) Using Calculus, show that if $a_{i_1}a_0 > 0$, then there are an even
number of positive roots, and if $a_{i_1}a_0 < 0$, then there are an odd
number of positive roots by comparing the behaviour of $p$ near 0
and $\infty$.

(c) Conclude that $s - t$ is even.

(d) Let $r > 0$. Show that if $a_{i_j}a_{i_{j+1}} < 0$, then the coefficient of $x^{i_{j+1}+1}$
in $(x - r)p(x)$ agrees in sign with $a_{i_{j+1}}$.

(e) Combine these facts to show that the number and parity of the sign 
changes must increase when multiplying by x — r. 

(f) Using induction on the number of positive roots, prove Descartes’s 
Rule of Signs. 


Notes on Chapter 5 


A precise notion of the real and complex numbers as we know it is a rather 
modern idea. To the ancient civilizations, numbers were positive integers. 
(See the notes to Chapter 1.) Positive rationals were considered as ratios
between two positive integers. The discovery that certain square roots such
as $\sqrt{2}$ were not ‘commensurable’ with the integers was disturbing. Some,
like the Babylonians, considered successive rational approximations of these 
numbers. Nevertheless, no notion of an extended number system developed 
at that time. 

Stevin proposed the use of finite decimals to represent numbers in 1585.
He recognized that arbitrary quantities could be approximated by his decimals.
But since a value like $\tfrac{1}{3}$ did not have an exact representation, most
others rejected the idea. Even 200 years later, Euler considered the real
numbers as the set of all ‘magnitudes’, and apparently no definition was
considered necessary. However, Euler introduced the notion of a variable $x$
which could take any magnitude. Since it was known by this time that roots
of equations may not be real, this raised the issue of what a real number was.

Bolzano, in 1817, had a notion of real numbers and completeness using
Cauchy sequences of rationals, but never published it. Cauchy also considered
the notion of convergent sequences of rationals, but did not propose a proper
theory. In 1858, Dedekind published his theory of the reals using cuts. In
1869, Meray published a construction using Cauchy sequences. Weierstrass,
Cantor and Heine also had related approaches. In 1900, Hilbert developed
an axiomatic approach: axioms for an ordered field together with two critical
axioms, the Archimedean property (there are no numbers $x$ such that $0 < x$
and $x < \tfrac{1}{n}$ for all $n \ge 1$) and a completeness property. He established the
uniqueness of such a field, thereby showing that different constructions such
as Dedekind’s and Meray’s must yield identical objects.

Surprisingly, quadratic equations did not lead to the discovery of complex
numbers, because a simple check of the discriminant determines whether
there are (real) solutions or not. In the early 1500’s, del Ferro and Tartaglia
found the formula for the roots of a cubic. This involved square roots of
negative numbers even when the roots are real. Cardano, who found the
formula for the roots of a quartic, considered numbers of the form $a + \sqrt{-b}$.
He was not convinced that they were bona fide quantities, but they worked.
Descartes, in 1637, coined the term ‘imaginary number’. Euler introduced
the use of $i = \sqrt{-1}$, as well as the polar form. Argand came up with
the notion of representing complex numbers in the plane in 1806. In 1831,
Hamilton described the complex numbers as ordered pairs of reals, $(a,b)$,
with vector sums and product $(a,b)(c,d) = (ac - bd, bc + ad)$. Gauss was
aware of the geometric representation of complex numbers in 1796, but did
not publish it until 1831. In 1847, Cauchy constructed the complex numbers
as an extension of the reals, $\mathbb{R}[x]/(x^2 + 1)$. Cauchy was also responsible
for the beginnings of complex function theory (calculus for complex valued
functions).

The fundamental theorem of algebra was proposed by Roth and later 
Girard in the early 1600’s, both stating that a (real) polynomial of degree 
n may have n roots. D’Alembert had a proof in 1746, but it had a gap. 
Euler, Lagrange and others made attempts, but implicitly assumed that 
there was a field extension in which the polynomial already has n roots. 
Wood in 1798 and Gauss in 1799 published proofs, that also had gaps. In 
1806, Argand published the first rigorous proof. Moreover he was the first 
to allow complex coefficients for his polynomials. Gauss published two other 
proofs in 1816. The proof that we give uses the extreme value theorem. A 
proof of this was found by Bolzano in 1830, but never published. It was 
later proven by Weierstrass in 1860. The extreme value theorem depends 
on the completeness property of the real numbers. 

Various introductory books on real analysis provide some construction
of the real numbers. The uniqueness is more subtle. A treatment of both
can be found in Garling [Ch. 2-3]. The fundamental theorem of algebra
is fundamentally a result in analysis. Standard comprehensive treatises on
algebra usually assume that polynomials of odd degree have a real root. This
is basically assuming the Intermediate Value Theorem, which is a consequence
of the completeness of the reals. Other proofs using complex analysis can
be found in many texts; for example, Simon has three proofs. The
proof given in our book is perhaps the simplest if one knows about the
completeness of the real numbers.


Chapter 6 


The Ring of Polynomials 


In this chapter, we investigate the algebraic properties of polynomials. The 
reader should notice the parallels between the structure of the integers and 
the structure of the polynomials. Most of the ideas that have been devel- 
oped for integers, such as primes, modular arithmetic, and so on, have a 
polynomial version. 


6.1. Preliminaries on Polynomials 


We use the notation $R[x]$ to denote the set of all polynomials with coefficients
in a ring $R$. That is, an element of $R[x]$ is an expression of the form

\[ r_d x^d + r_{d-1} x^{d-1} + \cdots + r_1 x + r_0 \]

where $x$ is a formal symbol and the coefficients $r_i$ belong to $R$. In particular,
we are especially interested in the case when $R$ is a field. So we will use the
symbol $F$ whenever we mean the result works for any field. The fields of
interest to us at the moment are the rationals $\mathbb{Q}$, the reals $\mathbb{R}$, the complex
numbers $\mathbb{C}$, and the fields $\mathbb{Z}_p$ for $p$ prime. So $F[x]$ will indicate any of $\mathbb{Q}[x]$,
$\mathbb{R}[x]$, $\mathbb{C}[x]$ or $\mathbb{Z}_p[x]$, whereas $R[x]$ may indicate $\mathbb{Z}[x]$ or $\mathbb{Z}_n[x]$ for composite
$n$ as well.
Addition of polynomials is defined as follows:

\[ \sum_{i=0}^{n} r_i x^i + \sum_{i=0}^{n} s_i x^i = \sum_{i=0}^{n} (r_i + s_i) x^i. \]

Multiplication is defined by the rule

\[ (r x^m)(s x^n) = (rs) x^{m+n} \]

together with the consequences of the distributive law, i.e.

\[ \Big( \sum_{i=0}^{m} r_i x^i \Big) \Big( \sum_{j=0}^{n} s_j x^j \Big) = \sum_{k=0}^{m+n} \Big( \sum_{i+j=k} r_i s_j \Big) x^k. \]

The zero element is the constant zero polynomial $0 \in R \subset R[x]$, and the
multiplicative identity is the constant polynomial $1 \in R \subset R[x]$. One checks
that with these operations, $R[x]$ is a ring. If $R$ is a commutative ring, then
we see $R[x]$ is as well.

The degree of a non-zero polynomial $p(x) = \sum_{i=0}^{n} p_i x^i$ is the largest
integer $\deg(p) = d$ so that $p_d \ne 0$. There is no natural degree for the 0
polynomial, but it is convenient to define $\deg(0) = -\infty$ since it makes the
following lemma work.


6.1.1. Lemma. Let $R$ be an integral domain. Then $R[x]$ is an integral
domain. Furthermore, if $p, q \in R[x]$, then

\[ \deg(pq) = \deg(p) + \deg(q). \]

Proof. If $p = 0$, then $pq = 0$ and we see

\[ \deg(pq) = -\infty = -\infty + \deg(q) = \deg(p) + \deg(q). \]

Therefore, we may assume both $p$ and $q$ are non-zero. Let $\deg(p) = d$ and
$\deg(q) = e$ with $d, e \ge 0$. Then

\[ p(x) = p_d x^d + \text{lower order terms} = \sum_{i=0}^{d} p_i x^i, \]
\[ q(x) = q_e x^e + \text{lower order terms} = \sum_{j=0}^{e} q_j x^j. \]

Thus a computation shows that

\[ pq(x) = (p_d q_e) x^{d+e} + \text{lower order terms}. \]

Since $p_d, q_e \ne 0$ and $R$ is an integral domain, $p_d q_e \ne 0$. Therefore, $pq \ne 0$
and

\[ \deg(pq) = d + e = \deg(p) + \deg(q), \]

as desired. ∎

Observe that if $R$ is not an integral domain, then Lemma 6.1.1 fails.
For example, if $R = \mathbb{Z}_6$, $p = 2x^3 + 1$ and $q = 3x$, then $pq = 3x$ so
$\deg(pq) = 1 \ne 4 = \deg(p) + \deg(q)$.


Even when $F$ is a field, the ring $F[x]$ is not a field. The element $x$ never
has an inverse in $F[x]$, as the following lemma shows.

6.1.2. Lemma. If $R$ is an integral domain, then the units

\[ R[x]^* = R^*. \]

In particular, for a field $F$, the group of units $F[x]^* = F^* = F \setminus \{0\}$.

Proof. If $r \in R^*$, then there exists $s \in R^*$ such that $rs = 1$. This equality
persists in $R[x]$, so $r \in R[x]^*$.

Conversely, if $p \in R[x]^*$, then there exists $q \in R[x]$ such that $pq = 1$.
Applying Lemma 6.1.1, we have

\[ 0 = \deg(1) = \deg(p) + \deg(q). \]

Since $p$ and $q$ are non-zero, $\deg(p), \deg(q) \ge 0$. Thus, $\deg(p) = \deg(q) = 0$,
i.e. $p, q \in R$, so $p \in R^*$. ∎

Again, this may be false if $R$ has zero divisors. For example, consider
$\mathbb{Z}_4$:

\[ (2x + 1)^2 = 4x^2 + 4x + 1 \equiv 1 \pmod 4. \]

So $2x + 1$ is a unit of $\mathbb{Z}_4[x]$. Notice how the degree of the product turned
out to be smaller than expected.
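
The two counterexamples above (the degree drop in $\mathbb{Z}_6[x]$ and the unit $2x + 1$ in $\mathbb{Z}_4[x]$) can be checked mechanically. The following is a minimal sketch; the coefficient-list representation (lowest degree first) and the function name `poly_mul_mod` are choices made for this illustration, not notation from the text.

```python
def poly_mul_mod(p, q, n):
    """Multiply two polynomials with coefficients in Z_n.

    Polynomials are coefficient lists, lowest degree first,
    so 2x^3 + 1 is [1, 0, 0, 2]. The zero polynomial is [].
    """
    if not p or not q:
        return []
    prod = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] = (prod[i + j] + a * b) % n
    # Drop leading zero coefficients so the length reflects the degree.
    while prod and prod[-1] == 0:
        prod.pop()
    return prod


# Z_6: (2x^3 + 1)(3x) = 6x^4 + 3x = 3x, so the degree is 1 rather than 4.
print(poly_mul_mod([1, 0, 0, 2], [0, 3], 6))  # [0, 3]

# Z_4: (2x + 1)^2 = 4x^2 + 4x + 1 = 1, so 2x + 1 is its own inverse.
print(poly_mul_mod([1, 2], [1, 2], 4))        # [1]
```

In both cases the leading coefficients multiply to zero in $\mathbb{Z}_n$, which is exactly what cannot happen in an integral domain.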

We end this section with two useful lemmas whose proofs we leave as
exercises. Lemma 6.1.4 explains why the notation $\sum_{i=0}^{n} r_i x^i$ is particularly
helpful: we can plug in elements of $R$ in place of $x$.


6.1.3. Lemma. Let $n \ge 2$ be an integer and let

\[ \pi : \mathbb{Z}[x] \to \mathbb{Z}_n[x] \]

be the function defined by

\[ \pi\Big( \sum_{i=0}^{m} r_i x^i \Big) = \sum_{i=0}^{m} [r_i] x^i \]

where $[r]$ is the equivalence class of $r$ in $\mathbb{Z}_n$. Then $\pi$ is a ring homomorphism,
i.e. $\pi(1) = 1$, $\pi(r + s) = \pi(r) + \pi(s)$, and $\pi(rs) = \pi(r)\pi(s)$ for all $r, s \in \mathbb{Z}[x]$.


6.1.4. Lemma. Let $R$ be a commutative ring and let $a \in R$. Consider
the evaluation map

\[ \mathrm{ev}_a : R[x] \to R \]

defined by

\[ \mathrm{ev}_a\Big( \sum_{i=0}^{n} r_i x^i \Big) = \sum_{i=0}^{n} r_i a^i. \]

Then $\mathrm{ev}_a$ is a ring homomorphism.


Exercises 


1. Prove Lemma 6.1.3.

2. Prove Lemma 6.1.4.


3. Let $F$ and $G$ be fields with $F \subset G$. Prove that if $p, q \in F[x]$, then $p \mid q$
in $F[x]$ if and only if $p \mid q$ in $G[x]$.

4. Show that Exercise 3 is false if $F$ and $G$ are allowed to be rings instead
of fields.


5. Show that if $R$ is an integral domain, then $R[x]$ is also an integral domain.


6. (Field generated by an element) Let $F$ and $G$ be fields with $F \subset G$.
Let $a \in G$ and let

\[ F(a) = \Big\{ \frac{f(a)}{g(a)} : f, g \in F[x],\ g(a) \ne 0 \Big\}. \]

Prove that $F(a)$ is a field with $F \subset F(a) \subset G$. It is referred to as the
field generated by $a$.


7. (Multivariate polynomial rings) Let $R$ be a ring. Define $R[x_1, \dots, x_n]$
to be the set of finite formal sums $\sum r_{i_1, \dots, i_n} x_1^{i_1} \cdots x_n^{i_n}$ with coefficients
$r_{i_1, \dots, i_n} \in R$. Define addition and multiplication analogously to
how it was defined for $R[x]$. Prove that $R[x_1, \dots, x_n]$ is a ring.


8. Let $R$ be a ring. Show $(R[x])[y]$, $(R[y])[x]$, and $R[x, y]$ are isomorphic.


6.2. Unique Factorization for Polynomials 


In this section, we will prove the division algorithm for polynomials, and
show, as for the integers, that this leads to a Euclidean algorithm and unique
factorization in $F[x]$, where $F$ is a field.


6.2.1. Definition. If $R$ is a commutative ring, a non-constant polynomial
$p$ in $R[x]$ is called irreducible if for every factorization $p = qr$ in $R[x]$, either
$q(x)$ or $r(x)$ is a unit.


Notice that this definition is a special case of the one given in Definition
1.8.6. In particular, it coincides with the definition of a prime in $\mathbb{Z}$ or
$\mathbb{Z}[\sqrt{d}]$. The term irreducible is used instead of prime for historical reasons.
We are primarily interested in polynomials over a field. However, we will
have reason to consider polynomials in $\mathbb{Z}[x]$.

The (long) division algorithm for polynomials is often taught in high school.
The technique is to divide the leading term of $p$ into the leading term of $q$.
Subtraction leaves a remainder of lower degree. Proceed iteratively until a
remainder of degree less than $\deg(p)$ is achieved. This can easily be done by
hand, or by computer.


6.2.2. Proposition (Division algorithm for polynomials). Suppose
that $q \ne 0$ and $p$ belong to $F[x]$. Then there is a unique quotient $a$ and
remainder $r$ in $F[x]$ so that

\[ p = aq + r \quad \text{and} \quad \deg(r) < \deg(q). \]

Proof. Proceed by induction on the degree of $p$. If $d := \deg(p) < \deg(q)$,
take $a = 0$ and $r = p$. Otherwise, $d \ge \deg(q) =: n$. Suppose that the result
holds for all polynomials of degree less than $d$. Let

\[ q = q_n x^n + \text{lower order terms} \quad \text{and} \quad p = p_d x^d + \text{lower order terms}, \]

where $q_n$ and $p_d$ are non-zero. The polynomial

\[ p_1(x) = p(x) - (p_d q_n^{-1}) x^{d-n} q(x) \]
\[ = (p_d x^d + \text{lower order terms}) - (p_d x^d + \text{lower order terms}) \]
\[ = \text{lower order terms}. \]

It follows that $\deg(p_1) < d = \deg(p)$. So by the induction hypothesis, the
polynomial $p_1$ can be written as $p_1 = a_1 q + r$ where $a_1$ and $r$ belong to $F[x]$
and $\deg(r) < \deg(q)$. Therefore,

\[ p(x) = p_1(x) + (p_d q_n^{-1}) x^{d-n} q(x) = \big( (p_d q_n^{-1}) x^{d-n} + a_1(x) \big) q(x) + r(x). \]

This establishes existence.

For uniqueness, notice that if $q \mid p$ and $\deg(p) < \deg(q)$, then $p = 0$. This
is because the identity $p = aq$ implies that

\[ \deg(p) = \deg(a) + \deg(q). \]

Only $\deg(p) = \deg(a) = -\infty$ makes this possible, for otherwise the right-hand
side is at least $\deg(q)$, which is strictly larger than $\deg(p)$. So $p = a = 0$.

Now suppose that $p = a_1 q + r_1 = a_2 q + r_2$ where both remainders have
degree less than $\deg(q)$. Then $q$ divides $(a_1 - a_2)q = r_2 - r_1$. Since

\[ \deg(r_2 - r_1) \le \max\{\deg(r_1), \deg(r_2)\} < \deg(q), \]

the previous argument shows that $r_2 - r_1 = 0$. Therefore we obtain $r_1 = r_2$
and $a_1 = a_2$. ∎
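
The existence half of the proof is effectively an algorithm: repeatedly subtract $(p_d q_n^{-1}) x^{d-n} q(x)$ to cancel the leading term. A minimal sketch over $\mathbb{Q}$ follows, assuming polynomials are stored as coefficient lists with the lowest degree first and the zero polynomial as the empty list; the names `trim` and `poly_divmod` are chosen for this illustration only.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(p, q):
    """Return (a, r) with p = a*q + r and deg(r) < deg(q), over Q."""
    q = trim(Fraction(c) for c in q)
    if not q:
        raise ZeroDivisionError("division by the zero polynomial")
    r = trim(Fraction(c) for c in p)
    n = len(q) - 1                            # n = deg(q)
    a = [Fraction(0)] * max(len(r) - n, 0)    # quotient has degree d - n
    while len(r) - 1 >= n:
        d = len(r) - 1                        # current degree of the remainder
        c = r[-1] / q[-1]                     # p_d * q_n^{-1}
        a[d - n] = c
        for j, qj in enumerate(q):            # subtract c * x^(d-n) * q(x)
            r[d - n + j] -= c * qj
        r = trim(r)                           # the leading term has cancelled
    return a, r


# (x^4 + 2x^3 + 4x^2 + 4x + 4) / (x^2 + 2) = x^2 + 2x + 2, remainder 0
print(poly_divmod([4, 4, 4, 2, 1], [2, 0, 1]))
# (x^3 + x + 1) / (x - 1) = x^2 + x + 2, remainder 3
print(poly_divmod([1, 1, 0, 1], [-1, 1]))
```

Exact rational arithmetic via `Fraction` is used so that the leading term cancels exactly at each step; with floating point the loop could fail to terminate cleanly.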


6.2.3. Corollary. The linear polynomial $x - c$ divides a polynomial $p$ if
and only if $p(c) = 0$.


Proof. Divide $x - c$ into $p$ by the division algorithm to obtain a quotient $a$
and leave a remainder $r$ of degree at most 0. So $r$ is a constant. Then

\[ p(c) = a(c)(c - c) + r = r. \]

So $x - c$ divides $p$ if and only if the remainder $p(c)$ equals 0. ∎


6.2.4 Euclidean algorithm for Polynomials. If $p$ and $q$ are non-zero
elements of $F[x]$, then there exists a greatest common divisor $d$ in $F[x]$ with
the properties:

(1) $d \mid p$ and $d \mid q$,
(2) there are polynomials $s$ and $t$ such that $d = ps + qt$,
(3) if $b \mid p$ and $b \mid q$, then $b \mid d$.


Proof. By Proposition 6.2.2, $F[x]$ is a Euclidean domain. So, the result
follows from Theorem 1.8.12. ∎
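
Properties (1) and (2) can be made concrete with the extended Euclidean algorithm, which repeats the division step and tracks the multipliers $s$ and $t$. The sketch below works over $\mathbb{Q}$ under the same assumptions as before (coefficient lists, lowest degree first). Normalizing $d$ to be monic is a convention adopted here, since a gcd is only determined up to a non-zero scalar, and all function names are illustrative.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def poly_divmod(p, q):
    """Quotient and remainder of p by a non-zero, trimmed q over Q."""
    r, n = trim(p), len(q) - 1
    a = [Fraction(0)] * max(len(r) - n, 0)
    while len(r) - 1 >= n:
        shift = len(r) - 1 - n
        c = r[-1] / q[-1]
        a[shift] = c
        for j, qj in enumerate(q):
            r[shift + j] -= c * qj
        r = trim(r)
    return a, r


def ext_gcd(p, q):
    """Return (d, s, t) with d = p*s + q*t and d the monic gcd of p and q."""
    r0 = trim(Fraction(c) for c in p)
    r1 = trim(Fraction(c) for c in q)
    s0, s1 = [Fraction(1)], []
    t0, t1 = [], [Fraction(1)]
    while r1:
        a, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, [-c for c in mul(a, s1)])
        t0, t1 = t1, add(t0, [-c for c in mul(a, t1)])
    lead = r0[-1]                       # rescale so that d is monic
    return ([c / lead for c in r0],
            [c / lead for c in s0],
            [c / lead for c in t0])


f = [4, 4, 4, 2, 1]      # x^4 + 2x^3 + 4x^2 + 4x + 4 = (x^2 + 2)(x^2 + 2x + 2)
g = [0, 2, 0, 1]         # x^3 + 2x = x(x^2 + 2)
d, s, t = ext_gcd(f, g)
print(d)                              # coefficients of x^2 + 2
print(add(mul(f, s), mul(g, t)) == d) # the identity d = f*s + g*t holds
```

The invariant maintained by the loop is that each remainder equals $p \cdot s_k + q \cdot t_k$, which is why the last non-zero remainder comes with its own pair of multipliers.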


It follows that we obtain the important consequence of unique factorization
for polynomials over a field.


6.2.5 Unique Factorization for Polynomials. Every polynomial in
$F[x]$ factors uniquely into a product of irreducibles. That is, if $r(x)$ factors
into irreducible terms as

\[ r = p_1 p_2 \cdots p_m = q_1 q_2 \cdots q_n, \]

then $m = n$, and there is a permutation $\pi$ and non-zero scalars $c_i \in F^*$ so
that $q_{\pi(i)} = c_i p_i$.


Proof. By Proposition and Remark 1.8.9, the hypotheses of Theorem
1.8.18 hold. So, $F[x]$ has unique factorization. ∎


Exercises 


1. Find $\gcd(f, g)$ and express it as a polynomial combination of $f$ and $g$
for the following examples in $\mathbb{Q}[x]$.
(a) $f(x) = x^4 + 7x^3 + 18x^2 + 20x + 8$ and
$g(x) = x^4 + 6x^3 + 7x^2 - 6x - 8$.
(b) f(a) = 2a* + 303 +227+32+2 and g(x) =2t+22-—2—-1. 


2. Factor p(«) = «++ 1 completely into irreducibles in each of the follow- 


ing: 
(a) (i) Ql] (ii) Ria] (iii) Cla]. 
(b) (i) Zale] (ii) Zs[x (ili) Zr[a]. 


3. (a) Show that a polynomial $p \in F[x]$ of degree 2 or 3 is irreducible if
and only if it has no roots in $F$.
(b) Give an example that shows that this is false for degree 4.


4. Let $f \in \mathbb{Q}[x]$, and let $f'$ be its derivative. Suppose that $p(x)$ is an
irreducible polynomial. Show that $p \mid \gcd(f, f')$ if and only if $p^2 \mid f$.


5. Show by example that Proposition is false if $F$ is replaced by an
arbitrary commutative ring.


6. Let $F$ and $G$ be fields with $F \subset G$. Let $p, q \in F[x]$. Let $f$ be $\gcd(p, q)$
computed in $F$ and let $g$ be $\gcd(p, q)$ computed in $G$. Prove that $f = g$.


6.3. Irreducible Polynomials in $\mathbb{Z}[x]$


Any polynomial in $\mathbb{Q}[x]$ can be multiplied by a large integer to clear the
denominators and leave a polynomial with integer coefficients. It is a convenient
fact, proven by Gauss, that a polynomial in $\mathbb{Z}[x]$ factors in $\mathbb{Q}[x]$
only if it factors in $\mathbb{Z}[x]$. In other words, it is not necessary to use fractions
to factor integer polynomials over the rationals. This makes it possible to
obtain certain simple tests providing sufficient conditions for irreducibility.


6.3.1. Definition. A polynomial in $\mathbb{Z}[x]$ is called primitive if the gcd of
its set of coefficients is equal to 1.


6.3.2. Lemma. If $r$ and $s$ are primitive polynomials in $\mathbb{Z}[x]$, then $rs$ is
also primitive.


Proof. Suppose the coefficients of $rs$ have gcd not equal to 1. Then there
exists a prime $p$ that divides all of the coefficients of $rs$. Given a polynomial
$f \in \mathbb{Z}[x]$, let $\pi(f) \in \mathbb{Z}_p[x]$ denote the polynomial obtained by reducing the
coefficients mod $p$, as in Lemma 6.1.3. Then $\pi(r)\pi(s) = \pi(rs) = 0$. By
Lemma 6.1.1, $\mathbb{Z}_p[x]$ is an integral domain, so either $\pi(r) = 0$ or $\pi(s) = 0$.
Without loss of generality, $\pi(r) = 0$. In other words, $p$ divides all of the
coefficients of $r$, and therefore $r$ is not primitive. ∎
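
A quick numerical check of Lemma 6.3.2 may help fix the definition. The sketch below computes the gcd of the coefficients (called the `content` here, a name not used in the text) and confirms that one particular product of primitive polynomials is again primitive; the polynomials are arbitrary examples.

```python
from functools import reduce
from math import gcd


def content(p):
    """gcd of the integer coefficients of p (list, lowest degree first)."""
    return reduce(gcd, (abs(c) for c in p), 0)


def is_primitive(p):
    return content(p) == 1


def poly_mul(p, q):
    """Product of two non-zero integer polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


r = [3, 2]        # 2x + 3, primitive
s = [2, 0, 3]     # 3x^2 + 2, primitive
rs = poly_mul(r, s)
print(rs, is_primitive(rs))   # [6, 4, 9, 6] True

# A non-primitive example for contrast: 4x + 6 has content 2.
print(content([6, 4]))        # 2
```

Note that every coefficient of the product except one is even, and every coefficient except one is divisible by 3, yet no single prime divides them all, as the lemma guarantees.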


6.3.3 Gauss’s Lemma. A polynomial $p \in \mathbb{Z}[x]$ factors in $\mathbb{Q}[x]$ only if it
factors in $\mathbb{Z}[x]$. Furthermore, if $p$ factors as $p = rs$ in $\mathbb{Q}[x]$, then there are
rational multiples $r'$ of $r$ and $s'$ of $s$ such that $r', s' \in \mathbb{Z}[x]$ and $p = r's'$.


Proof. Suppose that $p$ factors as $p = rs$ in $\mathbb{Q}[x]$. Choose integers $M$ and
$N$ so that $Mr$ and $Ns$ have integer coefficients. Let $m$ be the gcd of the
coefficients of $Mr$, so that $Mr = m r_1$, where $r_1$ is a primitive polynomial
in $\mathbb{Z}[x]$. Similarly, let $n$ be the gcd of the coefficients of $Ns$, and factor
$Ns = n s_1$, where $s_1$ is also primitive.

Compute

\[ (6.3.4) \qquad MNp = (Mr)(Ns) = mn(r_1 s_1). \]

By the lemma above, the polynomial $r_1 s_1$ is primitive. So the gcd of the
coefficients of this product is $mn$. Let $d$ be the gcd of the coefficients of $p$.
We obtain the equation

\[ MNd = mn. \]

Thus, dividing equation (6.3.4) by $MN = \frac{mn}{d}$ yields

\[ p = d\, r_1 s_1, \]

which is a factorization in $\mathbb{Z}[x]$. Taking $r' = d r_1 = \frac{dM}{m} r$ and $s' = s_1 = \frac{N}{n} s$,
we have completed the proof. ∎
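
The first step of the proof, writing $Mr = m r_1$ with $r_1$ primitive, is easy to carry out by machine. The following sketch does this for a non-zero polynomial with rational coefficients; the function name `primitive_part` and the choice of $M$ as the least common denominator are assumptions of this illustration (the proof allows any $M$ that clears denominators).

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def primitive_part(r):
    """For non-zero r in Q[x], return (m, M, r1) with M*r = m*r1,
    where M, m are positive integers and r1 is primitive in Z[x].
    Coefficient lists are lowest degree first."""
    r = [Fraction(c) for c in r]
    M = reduce(lcm, (c.denominator for c in r), 1)   # clears denominators
    cleared = [int(c * M) for c in r]                # coefficients of M*r
    m = reduce(gcd, (abs(c) for c in cleared), 0)    # gcd of those coefficients
    return m, M, [c // m for c in cleared]


# p = x^2 - 1 factors in Q[x] as r*s with r = 2x - 2 and s = x/2 + 1/2.
m, M, r1 = primitive_part([-2, 2])
n, N, s1 = primitive_part([Fraction(1, 2), Fraction(1, 2)])
print((m, M, r1))   # (2, 1, [-1, 1])   i.e. r1 = x - 1
print((n, N, s1))   # (1, 2, [1, 1])    i.e. s1 = x + 1
# Here M*N = 2 = m*n, so d = 1 and p = r1*s1 = (x - 1)(x + 1) in Z[x].
```

The rational factorization $(2x - 2)(\tfrac{1}{2}x + \tfrac{1}{2})$ is thus rescaled to the integer factorization $(x - 1)(x + 1)$, as the lemma promises.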
The following corollary of Gauss’s Lemma characterizes irreducible polynomials
in $\mathbb{Z}[x]$.


6.3.5. Corollary. Let $p \in \mathbb{Z}[x]$. Then $p$ is irreducible in $\mathbb{Z}[x]$ if and only
if $p$ is primitive and irreducible in $\mathbb{Q}[x]$.


Proof. First suppose $p(x)$ is irreducible in $\mathbb{Z}[x]$. Gauss’s Lemma 6.3.3 shows
that $p(x)$ remains irreducible in $\mathbb{Q}[x]$. Now, let $d$ be the greatest common
divisor of the coefficients of $p(x)$. Then $p(x) = dq(x)$ with $q(x) \in \mathbb{Z}[x]$. By
irreducibility of $p$, we must have $d = 1$ and so $p$ is primitive.

Conversely, suppose $p(x)$ is reducible in $\mathbb{Z}[x]$. Then we may factor $p(x) =
q(x)r(x)$ with $q(x), r(x)$ non-zero non-units in $\mathbb{Z}[x]$. If both $q$ and $r$ have
positive degree, then we see $p(x)$ is reducible in $\mathbb{Q}[x]$. If on the other hand,
$\deg(q) = 0$, then $q \in \mathbb{Z}$ is a common factor of the coefficients of $p(x)$. Since
$q$ is not a unit in $\mathbb{Z}[x]$, we have $q \ne \pm 1$, showing $p(x)$ is not primitive. ∎


The next result is a well known criterion for finding rational roots of
polynomials in $\mathbb{Z}[x]$.


6.3.6 Rational Root Theorem. If $\gcd(a, b) = 1$, and $\frac{a}{b}$ is a root of
$p(x) = \sum_{i=0}^{m} p_i x^i \in \mathbb{Z}[x]$, then $b \mid p_m$ and $a \mid p_0$.


Proof. By Corollary 6.2.3, $x - \frac{a}{b}$ is a factor of $p$ in $\mathbb{Q}[x]$. The rational
multiple of $x - \frac{a}{b}$ which is primitive is precisely $bx - a$. From Gauss’s
Lemma, $bx - a$ must be a factor of $p$ in $\mathbb{Z}[x]$. That is, for some $q \in \mathbb{Z}[x]$,

\[ p(x) = (bx - a)q(x) = b q_{m-1} x^m + \cdots - a q_0. \]

So $p_m = b q_{m-1}$ and $p_0 = -a q_0$. ∎


For example, consider the polynomial $x^3 + x + 1$. By the criterion above,
the only possible rational roots are $\pm 1$. Substituting $\pm 1$ into the above
polynomial, we see neither is a root. So this cubic has no linear factors.
Therefore, it is irreducible.

Similarly, consider $p = x^4 + 2x^3 + 4x^2 + 4x + 4$. By the theorem, the
only possibilities for rational roots are $\pm 1$, $\pm 2$, and $\pm 4$. Trial shows that
none are roots. (Clearly, $p$ has no positive roots. This cuts down on the
number of trials.) However, this only means that $p$ has no linear factors. It
does not imply that $p$ is irreducible. And, in fact, it is not. It factors as

\[ p(x) = (x^2 + x + 2)^2 - x^2 = (x^2 + 2)(x^2 + 2x + 2). \]
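
The Rational Root Theorem turns the search for rational roots into a finite check, which can be sketched as follows. The function names are illustrative, and the code assumes integer coefficients (lowest degree first) with non-zero constant and leading terms.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(p):
    """All rational roots of p in Z[x]; assumes p[0] != 0 and p[-1] != 0."""
    candidates = {Fraction(sign * a, b)
                  for a in divisors(p[0])      # a divides the constant term
                  for b in divisors(p[-1])     # b divides the leading coefficient
                  for sign in (1, -1)}

    def value(x):
        acc = Fraction(0)
        for c in reversed(p):                  # Horner evaluation
            acc = acc * x + c
        return acc

    return sorted(c for c in candidates if value(c) == 0)


print(rational_roots([1, 1, 0, 1]))        # x^3 + x + 1: []
print(rational_roots([4, 4, 4, 2, 1]))     # x^4 + 2x^3 + 4x^2 + 4x + 4: []
print(rational_roots([1, -3, 2]))          # 2x^2 - 3x + 1: [Fraction(1, 2), Fraction(1, 1)]
```

As the quartic above shows, an empty list rules out linear factors only; it says nothing about factors of higher degree.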


Exercises


1. Factor $8x^3 - 6x + 1$ in $\mathbb{Z}[x]$ or prove that it is irreducible.
2. Factor $x^4 - 5x^2 + 6x + 1$ in $\mathbb{Z}[x]$ or prove that it is irreducible.


3. Let f(x) = 2° +3244 2234272 +2—2 and g(x) = 2° + 204 +223 — 2? — 
4x + 2. 
(a) Find ged(f, 9). 
(b) Hence factor f and g completely in Z[z]. 


4. Find a quartic polynomial in $\mathbb{Z}[x]$ with $\sqrt{5} - 2\sqrt{3}$ as a root. Factor it
completely in $\mathbb{R}[x]$. Then prove that it is irreducible in $\mathbb{Q}[x]$.


5. Prove the following generalization of Gauss’s Lemma. Let $R$ be
any UFD and let $K$ be its fraction field, as defined in Exercise 5 of
Section 2.4. Prove that a polynomial $p \in R[x]$ factors in $K[x]$ only if it
factors in $R[x]$. Furthermore, prove that if $p$ factors as $p = rs$ in $K[x]$,
then there exist $a, b \in K^*$ such that $r' = ar \in R[x]$, $s' = bs \in R[x]$, and
$p = r's'$.

6. (a) Prove that $\mathbb{Z}[x]$ is a UFD.
Hint: Use that $\mathbb{Q}[x]$ is a UFD.
(b)* (Gauss’s Theorem on UFDs) Prove that if $R$ is a UFD, then
$R[x]$ is a UFD.


6.4. Eisenstein’s Criterion 


In this section, we develop another test for irreducibility that carries these 
ideas a little further. 


6.4.1 Eisenstein’s Criterion. Let $p = \sum_{i=0}^{d} p_i x^i \in \mathbb{Z}[x]$. Suppose that
$q$ is a prime integer such that $q \mid p_i$ for $0 \le i < d$, $q$ does not divide $p_d$, and
$q^2$ does not divide $p_0$. Then $p$ is irreducible in $\mathbb{Q}[x]$.


Proof. Suppose to the contrary that $p$ is reducible in $\mathbb{Q}[x]$. Then by Gauss’s
Lemma, we may write $p = rs$ in $\mathbb{Z}[x]$ with $\deg(r), \deg(s) > 0$. Write $r(x) =
\sum_{i=0}^{I} r_i x^i$ and $s(x) = \sum_{j=0}^{J} s_j x^j$ with $r_I, s_J \ne 0$ and $I, J \ge 1$. Then $I, J < d$.
The hypothesis tells us that $q$ does not divide $p_d = r_I s_J$, hence $q$ does not
divide $r_I$ and does not divide $s_J$. Since $q$ does divide $p_0 = r_0 s_0$ but $q^2$ does
not, it follows that $q$ divides one of $r_0$ or $s_0$, but not the other. Without loss
of generality, $q \mid r_0$ and $q$ does not divide $s_0$. Let $i_0 \le I$ be the least integer
for which $q$ does not divide $r_{i_0}$. Then

\[ p_{i_0} = (r_0 s_{i_0} + r_1 s_{i_0 - 1} + \cdots + r_{i_0 - 1} s_1) + r_{i_0} s_0. \]

From the choice of $i_0$, it follows that $q$ divides each term in the bracketed
sum, but does not divide $r_{i_0} s_0$. Thus $q$ does not divide $p_{i_0}$. Since $i_0 \le I < d$,
this is contrary to the hypotheses. Therefore $p$ must be irreducible. ∎


6.4.2. Example. For example, let us find an irreducible polynomial with
$\sin(\tfrac{2\pi}{7})$ as a root using de Moivre’s Theorem. Let us write $c := \cos(\tfrac{2\pi}{7})$ and
$s := \sin(\tfrac{2\pi}{7})$. Using the formula

\[ 1 = \cos(2\pi) + i\sin(2\pi) = (c + is)^7, \]

and taking the imaginary part of both sides, one obtains

\[ 0 = 7c^6 s - 35c^4 s^3 + 21c^2 s^5 - s^7 \]
\[ = 7(1 - s^2)^3 s - 35(1 - s^2)^2 s^3 + 21(1 - s^2)s^5 - s^7 \]
\[ = -s(64s^6 - 112s^4 + 56s^2 - 7). \]

Since $s = \sin(\tfrac{2\pi}{7}) \ne 0$, it is a root of the polynomial

\[ p(x) = 64x^6 - 112x^4 + 56x^2 - 7. \]

This polynomial is a perfect candidate for Eisenstein’s criterion. Note
that $\gcd(64, 7) = 1$, but 7 divides $-112$, 56 and $-7$. Since $7^2$ does not divide
$-7$, it follows that $p$ is irreducible in $\mathbb{Q}[x]$ and, being primitive, in $\mathbb{Z}[x]$ as well.
In particular, $p$ has no rational roots. So $\sin(\tfrac{2\pi}{7})$ is irrational.
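
Checking the hypotheses of Eisenstein’s criterion is purely mechanical, as the following sketch shows. The function name `eisenstein_applies` is illustrative; the code takes integer coefficients with the lowest degree first and assumes, without checking, that `q` is prime and that the polynomial has degree at least 1.

```python
def eisenstein_applies(p, q):
    """True if the prime q satisfies the hypotheses of Eisenstein's
    criterion for p (integer coefficients, lowest degree first)."""
    *lower, lead = p
    return (lead % q != 0                        # q does not divide p_d
            and all(c % q == 0 for c in lower)   # q divides p_i for i < d
            and p[0] % (q * q) != 0)             # q^2 does not divide p_0


# p(x) = 64x^6 - 112x^4 + 56x^2 - 7 from Example 6.4.2, with q = 7
print(eisenstein_applies([-7, 0, 56, 0, -112, 0, 64], 7))   # True

# The same polynomial with q = 2 fails, since 2 divides the leading coefficient.
print(eisenstein_applies([-7, 0, 56, 0, -112, 0, 64], 2))   # False
```

A `False` result for a particular prime proves nothing: the criterion is sufficient, not necessary, so a polynomial may be irreducible even though no prime satisfies the hypotheses.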


6.4.3. Example. Sometimes, one has to be clever to find a way to use
Eisenstein’s criterion. Let $q$ be an integer prime. Let

\[ p(x) = \frac{x^q - 1}{x - 1} = x^{q-1} + x^{q-2} + \cdots + x + 1. \]

There is no obvious way to use the method here. However, sometimes a
substitution helps. Notice that $p(x)$ factors as $p = rs$ if and only if $p(x + 1)$
factors as $p(x + 1) = r(x + 1)s(x + 1)$. So compute

\[ p(x + 1) = \frac{(x + 1)^q - 1}{(x + 1) - 1} = \frac{1}{x} \sum_{k=1}^{q} \binom{q}{k} x^k = \sum_{k=1}^{q} \binom{q}{k} x^{k-1} \]
\[ = x^{q-1} + q x^{q-2} + \cdots + \binom{q}{2} x + q. \]

Notice that the leading coefficient is $\binom{q}{q} = 1$, the constant coefficient is $\binom{q}{1} = q$,
and the other coefficients are $\binom{q}{k} = \frac{q!}{k!\,(q-k)!}$ for $2 \le k \le q - 1$. This is
always an integer. Now $q$ divides the numerator, but not the denominator.
Thus each is divisible by $q$, while the constant coefficient is not divisible by
$q^2$. So $p(x + 1)$ satisfies Eisenstein’s criterion, and thus is irreducible. So $p$
is irreducible as well.

The roots of $p$ are the $q - 1$ $q$-th roots of unity other than 1, namely
$e^{2k\pi i/q}$ for $1 \le k \le q - 1$.
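
The coefficients of $p(x + 1)$ are just binomial coefficients, so the divisibility claims in this example are easy to check for small primes. The sketch below is illustrative (the function names are not from the text) and uses `math.comb`, which requires Python 3.8 or later.

```python
from math import comb


def shifted_coefficients(q):
    """Coefficients (lowest degree first) of ((x+1)^q - 1)/x,
    i.e. the binomial coefficients C(q, k) for k = 1, ..., q."""
    return [comb(q, k) for k in range(1, q + 1)]


def satisfies_eisenstein_at(q):
    """Check Eisenstein's hypotheses for p(x+1) with respect to q itself."""
    coeffs = shifted_coefficients(q)
    *lower, lead = coeffs
    return (lead % q != 0
            and all(c % q == 0 for c in lower)
            and coeffs[0] % (q * q) != 0)


print(shifted_coefficients(5))      # [5, 10, 10, 5, 1]
print(satisfies_eisenstein_at(5))   # True
print(satisfies_eisenstein_at(7))   # True

# For composite q the argument breaks down: C(4, 2) = 6 is not divisible by 4.
print(shifted_coefficients(4))      # [4, 6, 4, 1]
print(satisfies_eisenstein_at(4))   # False
```

The failure for $q = 4$ reflects the step "$q$ divides the numerator, but not the denominator", which relies on $q$ being prime.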


Exercises 


1. Prove that x° — 210z4 — 903z3 + 168x — 315 is irreducible in Z[z]. 
2. Ifn>1 is a square free integer, show that x% — n is irreducible in Z[z]. 
3. Prove that «° — 22x74 + 196° — 887x? + 2036x — 1886 is irreducible in 
Zz]. 
HINT: substitute x — 1 for x. 
4. Prove that x’ — 142° + 842° — 280x4 + 56023 — 6722? + 4592 — 29 is 


irreducible in Z[z]. 
HINT: find a substitution that helps. 


5. Prove that if $n$ is composite, the polynomial $x^{n-1} + x^{n-2} + \cdots + x + 1$
is reducible.


6. (Schur) Prove the following special case of a result of Schur: if p is prime 
and aj,...,@p—1 € Z, the polynomial 1 + ye Ge gk + z is irreducible 
in Q|z]. 

7. Prove the following generalization of Eisenstein’s Criterion 6.4.1. Let $R$
be any UFD and let $K$ be its fraction field, as defined in Exercise 5 of
Section 2.4. Let $q \in R$ be irreducible and let $p = \sum_{i=0}^{d} p_i x^i \in R[x]$.
Suppose that $q \mid p_i$ for $0 \le i < d$, $q$ does not divide $p_d$, and $q^2$ does not
divide $p_0$. Prove $p$ is irreducible in $K[x]$.


8. Prove $x^n + y^n - 1$ is irreducible in $\mathbb{Q}[x, y]$ for all $n \ge 1$.
Hint: consider this as an element of $(\mathbb{Q}[x])[y]$ and note that $x^n - 1$ has
a linear factor.


6.5. Factoring Modulo Primes 


Another simple test for irreducibility is to study the factorization of f(x) 
modulo p for various small primes p. 


6.5.1. Lemma. If $f \in \mathbb{Z}[x]$ is reducible in $\mathbb{Q}[x]$, then it is reducible modulo
$p$ for every prime $p$ relatively prime to the leading coefficient of $f$.

The reason for the condition on $p$ in Lemma 6.5.1 is so that the degree
of $f$ does not decrease when moving to $\mathbb{Z}_p$. For example, the reducible
polynomial $f(x) = 2x^2 + 3x + 1 = (2x + 1)(x + 1)$ reduces to $\bar{f} = x + 1 \pmod 2$, which is irreducible.

Proof of Lemma 6.5.1. Fix a prime $p$ relatively prime to the leading coefficient of $f$. For any $h \in \mathbb{Z}[x]$, let $\pi(h) \in \mathbb{Z}_p[x]$
be as in Lemma 6.1.3. By Gauss’s Lemma, we may write $f = rs$ in $\mathbb{Z}[x]$
with $\deg(r), \deg(s) > 0$. Then $\pi(f) = \pi(r)\pi(s)$. The product of the leading
coefficients of $r$ and $s$ is the leading coefficient of $f$, and hence the leading
coefficients of $r$ and $s$ are relatively prime to $p$. So

\[ \deg(\pi(r)) = \deg(r) \quad \text{and} \quad \deg(\pi(s)) = \deg(s). \]

Hence both $\pi(r)$ and $\pi(s)$ are non-trivial factors in $\mathbb{Z}_p[x]$. □


We state the contrapositive form as a corollary. 


6.5.2. Corollary. If $f \in \mathbb{Z}[x]$ has leading coefficient coprime to $p$, and $f$
is irreducible modulo $p$, then $f$ is irreducible in $\mathbb{Q}[x]$.

6.5.3. Example. Let $f(x) = x^5 + 5x^4 + 6x + 1$. By Gauss’s Lemma, the
only possible rational roots are $\pm 1$, neither of which works. So, if $f$ factors at all, it
must be into a product of a cubic and a quadratic polynomial. Modulo 3,
this polynomial is

\[ f(x) \equiv x^5 - x^4 + 1 \pmod 3. \]

The simplest approach is to find all the irreducible quadratic polynomials
in $\mathbb{Z}_3[x]$, and test them. The reducible quadratics are the ones with zero
constant coefficient, and the three products $(x \pm 1)(x \pm 1)$; namely, $x^2$,
$x^2 \pm x$, $x^2 - 1$, and $x^2 \pm x + 1$. That leaves $x^2 + 1$ and $x^2 \pm x - 1$ as the three
irreducible monic quadratic polynomials in $\mathbb{Z}_3[x]$. A calculation shows that
none of them divide $f(x)$. Hence $f$ is irreducible in $\mathbb{Z}_3[x]$. Therefore it is
irreducible over the rationals as well.
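
The trial division in Example 6.5.3 can be automated. The sketch below is an illustration only (the helper names are hypothetical): it tests irreducibility over $\mathbb{Z}_p$ by dividing by every monic polynomial of degree at most half the degree of $f$, which is practical only for small $p$ and small degree. It assumes the leading coefficient of $f$ is not divisible by $p$.

```python
from itertools import product

def poly_mod_rem(a, b, p):
    """Remainder of a divided by the monic polynomial b in Z_p[x].

    Coefficients are listed from the constant term upward.
    """
    a = [c % p for c in a]
    while len(a) >= len(b):
        lead = a[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] = (a[i + shift] - lead * c) % p
        a.pop()  # the leading coefficient is now zero
    return a

def is_irreducible_mod_p(f, p):
    """Trial-divide f by every monic polynomial of degree 1..deg(f)//2 over Z_p."""
    degree = len(f) - 1
    for d in range(1, degree // 2 + 1):
        for lower in product(range(p), repeat=d):
            divisor = list(lower) + [1]
            if not any(poly_mod_rem(f, divisor, p)):
                return False
    return True

f = [1, 6, 0, 0, 5, 1]  # x^5 + 5x^4 + 6x + 1, constant term first
print(is_irreducible_mod_p(f, 3))  # expected: True
```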


6.5.4. Example. This method can also be used to factor polynomials, by
using the Chinese Remainder Theorem. Consider $f(x) = x^5 - 12x^3 + 17x^2 - 10x + 2$. The only possible rational roots are $\pm 1$ and $\pm 2$, none of which
work. So if this factors, it is into the product of a cubic and a quadratic.
Suppose that we have factored it mod 3 and mod 5 into irreducible factors.

\[ f(x) \equiv (x^3 - x - 1)(x^2 + 1) \pmod 3 \]
\[ f(x) \equiv (x^3 + 3x^2 + x + 2)(x + 1)^2 \pmod 5 \]

The cubic term $g(x)$, if it exists, is congruent to $x^3 - x - 1 \pmod 3$ and
congruent to $x^3 + 3x^2 + x + 2 \pmod 5$. Solving this system of congruences,
we find that

\[ g(x) \equiv x^3 + 3x^2 + 11x + 2 \pmod{15}. \]

Moreover, the leading coefficient divides 1, and hence must be 1; and the
constant coefficient must divide 2. So it must be 2. Let us write $g(x) = x^3 + ax^2 + bx + 2$.
This forces the constant coefficient of the quadratic term $h(x)$ to be 1.
Since the coefficient of $x^4$ in $f(x)$ is 0, the coefficient of $x$ in $h$ must be $-a$.
Let’s write $h(x) = x^2 - ax + 1$. Trial of small choices for $a$ and $b$ now yields
the factorization

\[ x^5 - 12x^3 + 17x^2 - 10x + 2 = (x^3 + 3x^2 - 4x + 2)(x^2 - 3x + 1). \]


This kind of search can be carried out with reasonable efficiency on a 
computer. However, this is not the standard algorithm used on computers 
to factor polynomials. The methods used will be discussed at the end of the 
next chapter. 
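
As a rough illustration of such a search (and not the standard factoring algorithm), the following Python sketch combines the factors of Example 6.5.4 modulo 3 and modulo 5 by the Chinese Remainder Theorem and then tries small lifts of $a$ and $b$. The function names and the search bound are assumptions made for this example.

```python
def crt_pair(r3, r5):
    """Smallest non-negative x with x = r3 (mod 3) and x = r5 (mod 5)."""
    return next(x for x in range(15) if x % 3 == r3 % 3 and x % 5 == r5 % 5)

def poly_mul(a, b):
    """Product of two integer polynomials, coefficients constant term first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

f = [2, -10, 17, -12, 0, 1]   # x^5 - 12x^3 + 17x^2 - 10x + 2
g_mod3 = [-1, -1, 0, 1]       # x^3 - x - 1
g_mod5 = [2, 1, 3, 1]         # x^3 + 3x^2 + x + 2
g_mod15 = [crt_pair(u, v) for u, v in zip(g_mod3, g_mod5)]

def search(bound=2):
    """Try a = a0 + 15s and b = b0 + 15t for |s|, |t| <= bound."""
    a0, b0 = g_mod15[2], g_mod15[1]
    for s in range(-bound, bound + 1):
        for t in range(-bound, bound + 1):
            a, b = a0 + 15 * s, b0 + 15 * t
            g = [2, b, a, 1]      # x^3 + a x^2 + b x + 2
            h = [1, -a, 1]        # x^2 - a x + 1
            if poly_mul(g, h) == f:
                return g, h
    return None

print(g_mod15)    # coefficients of g modulo 15, constant term first
print(search())   # the cubic and quadratic factors, if found
```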


Exercises 


1. Reduce the polynomial in Section 6.4, Exercise 3 modulo 3 and factor it completely.
Use this to show that the polynomial is irreducible in $\mathbb{Z}[x]$.

2. Reduce the polynomial in Section 6.4, Exercise 4 modulo 2. Use this to show that
the polynomial is irreducible in $\mathbb{Z}[x]$.

3. Decide if x° + 22 4+ 4 is irreducible in $\mathbb{Z}[x]$ by reducing mod 3.

4. Prove that $p(x) = x^4 + 1$ is reducible in $\mathbb{Z}_p[x]$ for every prime $p$. (Compare with Section [6.2] 21.)
HINT: you need to know when $-1$, $\pm 2$ are squares mod $p$.

5. The polynomial $q(x) = x^6 - 6x^4 + 14x^3 + 12x^2 + 84x + 41$ factors as
\[ q \equiv (x^2 - x - 1)^3 \pmod 3 \quad \text{and} \quad q \equiv (x - 3)^3 (x + 3)^3 \pmod 7. \]
Show that $q$ is irreducible.

6. Show that if $n$ is odd and $p$ is prime, then $f(x) = x^n - p^2$ is irreducible
in $\mathbb{Z}[x]$.
HINT: if $f = gh$, then $g(x^2)h(x^2) = (x^n - p)(x^n + p)$.

7. Show that $x^4 + 12x^2 + 18x + 6$ is irreducible in $(\mathbb{Z}[i])[x]$. Remember that
2 is not a prime in the Gaussian integers.

8. (Perron’s irreducibility criterion)
Let $f(x) = x^n + a_{n-1}x^{n-1} + \dots + a_0$ with $a_i \in \mathbb{Z}$ and $a_0 \ne 0$, and suppose
\[ |a_{n-1}| > 1 + |a_{n-2}| + \dots + |a_1| + |a_0|. \]
Prove that $f$ is irreducible in $\mathbb{Z}[x]$.
HINT: Use Section 5.5, Exercise 5.

9. (A variant on Cohn’s irreducibility criterion¹)
Let $f(x) = a_d x^d + a_{d-1}x^{d-1} + \dots + a_0$ with $a_i \in \mathbb{Z}$ and $a_d \ne 0$. Suppose
there is $n \in \mathbb{Z}$ with $f(n)$ prime and
\[ n \ge 2 + \max_{0 \le i < d} \left| \frac{a_i}{a_d} \right|. \]
Prove that $f(x)$ is irreducible in $\mathbb{Z}[x]$.
HINT: Use Section 5.5, Exercise 2 to bound the roots of $f$. Show that
if $f = gh$, then $|g(n)| > 1$.


6.6. Algebraic Numbers 


6.6.1. Definition. A complex number $w$ is called algebraic if it is the
root of a non-zero polynomial in $\mathbb{Q}[x]$. A monic polynomial $p$ in $\mathbb{Q}[x]$ of least degree
such that $p(w) = 0$ is called the minimal polynomial of $w$.

We will establish that the minimal polynomial is unique. No particular
properties of the field of rational numbers are used here. Indeed, if $F$ is any
field contained in a larger field $G$ and $w \in G$ is a root of a non-zero polynomial in
$F[x]$, then the minimal polynomial of $w$ is the monic polynomial of least
degree in $F[x]$ with $w$ as a root. The following result is valid in this greater
generality without any change in the proof.

6.6.2. Theorem. The minimal polynomial $p$ of an algebraic number $w$ is
unique. Moreover, $p$ is irreducible, and if $q$ is another polynomial such that
$q(w) = 0$, then $p$ divides $q$.

Proof. If $q$ and $r$ are two polynomials such that $q(w) = r(w) = 0$, let
$s = \gcd(q, r)$. By the Euclidean algorithm for $\mathbb{Q}[x]$, there are polynomials $a$
and $b$ in $\mathbb{Q}[x]$ so that $s = aq + br$. Hence

\[ s(w) = a(w)q(w) + b(w)r(w) = 0. \]

In particular, the monic polynomial $t = \gcd(p, q)$ satisfies $t(w) = 0$.
Thus $\deg(t) \ge \deg(p)$. Since $t \mid p$, it follows that $t$ and $p$ are scalar multiples
of one another. Since $p$ and $t$ are monic, it follows that $t = p$. Hence, $p$ also
divides $q$. If $q$ is itself a minimal polynomial of $w$, then $p$ and $q$ are monic of the same degree and $p$ divides $q$, so $q = p$; this proves uniqueness.

Suppose that $p$ is not irreducible over $\mathbb{Q}$, say $p = qr$ where $q$ and $r$
are non-constant polynomials in $\mathbb{Q}[x]$. But then

\[ 0 = p(w) = q(w)r(w). \]

So either $q(w) = 0$ or $r(w) = 0$. But this is impossible, as they have smaller
degree than $p$, which is the polynomial of smallest degree in $\mathbb{Q}[x]$ with $w$ as
a root. Hence $p$ must be irreducible. □


This immediately yields a powerful test for irrationality of algebraic
numbers.

¹This variant of Cohn’s irreducibility criterion is due to Murty [27].

6.6.3. Corollary. If $w$ is a root of an irreducible polynomial $p$ in $\mathbb{Z}[x]$ of
degree at least 2, then $w$ is irrational.

Proof. From the hypothesis, it follows that $p$ is the minimal polynomial of
$w$ (up to a scalar). The minimal polynomial of a rational number $r$ is $x - r$,
which has degree 1. So $w$ is irrational. □

6.6.4. Example. If $|n| > 1$ is square free, the polynomial $x^k - n$ is
irreducible by Eisenstein’s criterion. Just take any prime $p$ dividing $n$, and
note that $p$ divides all the zero coefficients, and $p^2$ does not divide $n$. So
$x^k - n$ is the minimal polynomial of $\sqrt[k]{n}$. This gives another proof of the
irrationality of $\sqrt[k]{n}$ for $k \ge 2$.


6.6.5. Example. Let $w = \sqrt[3]{3} - \sqrt{2}$. Notice that

\[ 3 = (w + \sqrt{2})^3 = w^3 + 3\sqrt{2}\,w^2 + 6w + 2\sqrt{2}. \]

Hence we may compute

\[ (w^3 + 6w - 3)^2 = (3\sqrt{2}\,w^2 + 2\sqrt{2})^2 \]
\[ w^6 + 12w^4 - 6w^3 + 36w^2 - 36w + 9 = 18w^4 + 24w^2 + 8 \]
\[ w^6 - 6w^4 - 6w^3 + 12w^2 - 36w + 1 = 0. \]

From the rational roots theorem, the only possible rational roots of the
polynomial $p(x) = x^6 - 6x^4 - 6x^3 + 12x^2 - 36x + 1$ are $\pm 1$, neither of which
works.

In fact, $p$ is irreducible in $\mathbb{Q}[x]$. By Gauss’s Lemma, it suffices to show
that $p$ is irreducible in $\mathbb{Z}[x]$. To see this, reduce it mod 3. The polynomial
factors as

\[ p \equiv (x^2 + 1)^3 \pmod 3 \]

and $x^2 + 1$ is irreducible in $\mathbb{Z}_3[x]$ since it has no roots in $\mathbb{Z}_3$. So if $p$ factors
in $\mathbb{Z}[x]$, it factors as a quadratic times a quartic. The quadratic must be
$x^2 + 3ax + 1$. There are two ways to proceed, and both are computational.
One is to write down a general quartic, multiply it by $x^2 + 3ax + 1$, and set
it equal to $p$. Then a calculation shows that the equations can’t be solved.
Since the coefficients of $x$ and $x^5$ are forced, simple conditions on $a$ and the
coefficient $b$ of $x^2$ lead to a contradiction. Alternatively, we can factor $p$
mod 7 as

\[ p(x) \equiv (x^3 - 2x^2 - x - 2)(x^3 + 2x^2 - x + 3) \pmod 7. \]

You can check quickly that neither cubic has a root in $\mathbb{Z}_7$, and thus they are
irreducible. This shows that any factorization in $\mathbb{Z}[x]$ must be into cubics.
This is incompatible with the factorization mod 3. So $p$ is irreducible.

It is a non-trivial fact that if $u$ and $v$ are algebraic numbers, then $u + v$,
$uv$ and (when $v \ne 0$) $u/v$ are all algebraic numbers. This will be proven in
Theorem 6.9.3.
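
The computations in Example 6.6.5 can be double-checked numerically. The sketch below is only a sanity check under floating-point arithmetic, with hypothetical helper names. It evaluates $p$ at $w = \sqrt[3]{3} - \sqrt{2}$ and multiplies out the claimed factorizations modulo 3 and modulo 7.

```python
def poly_mul_mod(a, b, m):
    """Product of two polynomials modulo m, coefficients constant term first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % m
    return out

def has_root_mod(f, m):
    """True if f has a root in Z_m (brute force over all residues)."""
    return any(sum(c * x ** k for k, c in enumerate(f)) % m == 0 for x in range(m))

# p(x) = x^6 - 6x^4 - 6x^3 + 12x^2 - 36x + 1, constant term first
p = [1, -36, 12, -6, -6, 0, 1]

# w should be a root of p, up to floating-point error
w = 3 ** (1 / 3) - 2 ** 0.5
residual = sum(c * w ** k for k, c in enumerate(p))

# (x^2 + 1)^3 modulo 3
q = [1, 0, 1]
cube_mod3 = poly_mul_mod(poly_mul_mod(q, q, 3), q, 3)

# (x^3 - 2x^2 - x - 2)(x^3 + 2x^2 - x + 3) modulo 7
cubic1 = [-2, -1, -2, 1]
cubic2 = [3, -1, 2, 1]
product_mod7 = poly_mul_mod(cubic1, cubic2, 7)

assert abs(residual) < 1e-9
assert cube_mod3 == [c % 3 for c in p]
assert product_mod7 == [c % 7 for c in p]
assert not has_root_mod(cubic1, 7) and not has_root_mod(cubic2, 7)
```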


Exercises 


1. Show that $\sin(1^\circ) = \sin(\pi/180)$ is algebraic.

2. (a) Find a polynomial $p$ in $\mathbb{Z}[x]$ with $\sqrt{3} + \sqrt[3]{5}$ as a root.
(b) Hence prove that $\sqrt{3} + \sqrt[3]{5}$ is irrational.
(c) Suppose you have calculated that $p$ factors modulo 3 as $(x + 1)^6$,
and modulo 5 as $(x^2 + 2)^3$. Show that $p$ is irreducible.


Find the minimal polynomial of VO eee 


4. Let $F$ be a field and let $p(x) = a_0 + a_1 x + \dots + a_n x^n$ and $q(x) = a_n + a_{n-1}x + \dots + a_0 x^n$ belong to $F[x]$. If $a_0 a_n \ne 0$, what is the relationship
between the roots of $p$ and the roots of $q$? Hence conclude that if $\alpha$ is
algebraic over $F$, then so is $1/\alpha$.

5. (Primitive Element Theorem) Let $\alpha$ and $\beta$ be algebraic numbers.
Recall the definition of a field generated by an element given in Exercise
6 of Section 6.1. Let $f(x) \in \mathbb{Q}[x]$ be the minimal polynomial of $\alpha$ and
let $g(x) \in \mathbb{Q}[x]$ be the minimal polynomial of $\beta$.

(a) Prove there exists $c \in \mathbb{Q}$ such that $\beta$ is the only common complex
root of $g(x)$ and $h(x) = f(\alpha + c(\beta - x))$.

(b) Let $\gamma = \alpha + c\beta$. Prove that $\gcd(g, h) = w(x - \beta) \in \mathbb{Q}(\gamma)[x]$.
HINT: Use Exercise [6] from Section

(c) Prove that $\alpha, \beta \in \mathbb{Q}(\gamma)$ and conclude that $\mathbb{Q}(\alpha)(\beta) = \mathbb{Q}(\gamma)$.

(d) Prove that if $\alpha_1, \dots, \alpha_n$ are algebraic numbers, then there exists $\delta$
such that $\mathbb{Q}(\alpha_1)(\alpha_2) \cdots (\alpha_n) = \mathbb{Q}(\delta)$.


6.7. Transcendental Numbers 


A complex number which is not algebraic is called transcendental. In this
section, we will establish that various complex numbers are transcendental.
This problem has a long history. Liouville showed that certain numbers
were transcendental in 1851. However, his methods did not apply to many
naturally occurring numbers, such as $\pi$ and $e$. In 1873, Hermite showed that
$e$ was transcendental. And in 1882, Lindemann generalized his argument to
show that any non-trivial sum

\[ \sum_{i=1}^{n} a_i e^{b_i} \]

is never 0 if the $a_i \ne 0$ are algebraic, and the $b_i$ are distinct algebraic
numbers. This means that $\pi$ is not algebraic because

\[ e^{i\pi} + e^{0} = 0. \]

As 0, 1, and $i$ are all algebraic, $\pi$ must be transcendental. In 1934, Gelfond
and Schneider proved that $a^b$ is always transcendental if $a \ne 0$ or 1 is
algebraic, and $b$ is an irrational algebraic number. In general, these results
are very difficult. We will take a look at the results of Liouville and Hermite.

Liouville’s result is based on the fact that irrational algebraic numbers
cannot be approximated too quickly by rational numbers. This is made
precise in the following theorem.


6.7.1. Theorem. Suppose that $w$ is a real root of an irreducible polynomial

\[ p(x) = \sum_{i=0}^{d} p_i x^i \]

in $\mathbb{Q}[x]$ of degree $d > 1$. Then there is a positive number $\delta > 0$ so that for
every rational number $\frac{a}{b}$ with $a, b \in \mathbb{Z}$ relatively prime, we have

\[ \left| w - \frac{a}{b} \right| \ge \frac{\delta}{|b|^d}. \]

Proof. We will assume that $p$ has integer coefficients, because this can easily
be achieved by multiplying $p$ by a large integer. Let

\[ M = \max_{|x - w| \le 1} |p'(x)| \le \sum_{i=1}^{d} i\,|p_i|\,(|w| + 1)^{i-1} < \infty. \]

The next observation is the key idea. The number $b^d p(\frac{a}{b})$ is a non-zero
integer. It is an integer because

\[ b^d p\left(\frac{a}{b}\right) = \sum_{i=0}^{d} p_i a^i b^{d-i}. \]

Since $p$ is irreducible of degree $d > 1$, it has no rational roots. Thus $b^d p(\frac{a}{b}) \ne 0$. A non-zero
integer has modulus at least 1. Hence

\[ \left| p\left(\frac{a}{b}\right) \right| \ge |b|^{-d}. \]

Now apply the mean value theorem. Suppose that $|\frac{a}{b} - w| \le 1$. Then
there is a real number $c$ between $w$ and $\frac{a}{b}$ so that

\[ p\left(\frac{a}{b}\right) = p\left(\frac{a}{b}\right) - p(w) = p'(c)\left(\frac{a}{b} - w\right). \]

Hence

\[ \left| w - \frac{a}{b} \right| = \frac{|p(\frac{a}{b})|}{|p'(c)|} \ge \frac{1}{M |b|^d}. \]

If we set $\delta = \min\{1, M^{-1}\}$, the desired formula holds. The hard part
was done in the previous paragraph for fractions close to $w$. The remaining
case, when $|w - \frac{a}{b}| > 1$, follows since $1 \ge \delta/|b|^d$. □


6.7.2. Example. (Liouville numbers) Let $q > 1$ be an integer, and
define

\[ w = \sum_{k \ge 1} q^{-k!}. \]

Then $w$ is transcendental. To see this, first observe that the base-$q$ expansion
of $w$ is given by a sequence of 1’s and 0’s with a 1 in the $k!$-th place;
since this is a non-repeating sequence, $w$ must be irrational. To prove $w$ is
transcendental, we may therefore apply Theorem 6.7.1. Let $b_n = q^{n!}$ and

\[ a_n = q^{n!} \sum_{k=1}^{n} q^{-k!} = \sum_{k=1}^{n} q^{n! - k!}. \]

Notice that

\[ \left| w - \frac{a_n}{b_n} \right| = \sum_{k \ge n+1} q^{-k!} \le q^{-(n+1)!} \sum_{j \ge 0} q^{-j} \le 2q^{-(n+1)!}. \]

Consider any positive integer $d$. Then for all $n \ge d$,

\[ b_n^d \left| w - \frac{a_n}{b_n} \right| \le 2q^{d\,n! - (n+1)!} \le 2q^{-n!}. \]

Since this tends to 0 as $n$ tends to infinity, there is no integer $d$ and positive
$\delta$ such that

\[ \left| w - \frac{a}{b} \right| \ge \frac{\delta}{|b|^d} \]

for all fractions. By Theorem 6.7.1, $w$ cannot be algebraic.
For example, $w = \sum_{k \ge 1} 10^{-k!} = 0.1100010000000000000000010\ldots$ is
transcendental.
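
To get a feeling for how fast these approximations converge, one can compute the scaled error $b_n^d\,|w - a_n/b_n|$ exactly with rational arithmetic. The sketch below is illustrative only; the function names are hypothetical, and $w$ is replaced by a longer partial sum, which makes the reported error slightly too small.

```python
from fractions import Fraction
from math import factorial

def liouville_partial(q, n):
    """The fraction a_n / b_n, that is, the sum of q^(-k!) for k = 1..n."""
    return sum(Fraction(1, q ** factorial(k)) for k in range(1, n + 1))

def scaled_error(q, n, d, extra_terms=2):
    """b_n^d * |w - a_n/b_n| with w truncated after n + extra_terms terms.

    The omitted tail is below 2 * q^(-(n + extra_terms + 1)!), so the value
    returned is a very slight underestimate of the true scaled error.
    """
    b_n = q ** factorial(n)
    tail = liouville_partial(q, n + extra_terms) - liouville_partial(q, n)
    return b_n ** d * tail

q, d = 10, 3
for n in range(d, d + 3):
    # bound from the example: 2 * q^(d * n! - (n + 1)!)
    bound = 2 * Fraction(q) ** (d * factorial(n) - factorial(n + 1))
    assert scaled_error(q, n, d) <= bound
```

Even for a fixed exponent $d$, the scaled error collapses very quickly as $n$ grows, which is exactly what Theorem 6.7.1 forbids for an algebraic number of degree $d$.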


Now let us consider the much more difficult task of showing that e is 
transcendental. This proof has been simplified over the years, but perhaps 
it will seem rather mysterious because so much of the ‘scaffolding’ has been 
removed in order to make it short. The proof uses calculus, not surprisingly, 
since it is in calculus that properties of e are developed. In particular, we 
use the fact that $\frac{d}{dx}(e^x) = e^x$.


6.7.3. Theorem. e is transcendental. 


Proof. Suppose, to the contrary, that there are integers $a_0, \dots, a_n$ so that

\[ a_0 + a_1 e + \dots + a_n e^n = 0. \]

We may assume that $a_n a_0 \ne 0$. For any large prime $p > \max\{|a_0|, n\}$,
consider the polynomial

\[ f(x) = \frac{1}{(p-1)!}\, x^{p-1} (1 - x)^p (2 - x)^p \cdots (n - x)^p = \frac{(n!)^p}{(p-1)!}\, x^{p-1} + \text{higher order terms} = \frac{1}{(p-1)!} \sum_{k=p-1}^{K} f_k x^k, \]

where $K = (n+1)p - 1$ is the degree of $f$. Notice that the coefficients $f_k$
are integers.

We need information about the values $f^{(j)}(i)$ for integers $j \ge 0$ and
$0 \le i \le n$, where $f^{(j)}$ means the $j$-th derivative of $f$. Notice that for $j \ge p$,

\[ f^{(j)}(x) = \frac{1}{(p-1)!} \sum_{k=j}^{K} k(k-1)\cdots(k-j+1)\, f_k x^{k-j} = \sum_{k=j}^{K} \binom{k}{j}\, j(j-1)\cdots(p)\, f_k x^{k-j}. \]

This polynomial has integer coefficients which are multiples of $p$. Hence

\[ f^{(j)}(i) \equiv 0 \pmod p \quad \text{for } j \ge p,\ i \in \mathbb{Z}. \]

Now $f$ has a zero of order $p$ at each integer $1 \le i \le n$. So each $i$ is also a
root of $f^{(j)}$ for $0 \le j \le p-1$. (See the exercises.) Hence

\[ f^{(j)}(i) = 0 \quad \text{for } 0 \le j \le p-1,\ 1 \le i \le n. \]

Similarly, since $f$ has a zero of order $p - 1$ at 0,

\[ f^{(j)}(0) = 0 \quad \text{for } 0 \le j \le p-2. \]

Finally, there is one term which is not a multiple of $p$,

\[ f^{(p-1)}(0) = (n!)^p \not\equiv 0 \pmod p. \]

The next trick is to introduce the polynomial

\[ F(x) = \sum_{j=0}^{K} f^{(j)}(x). \]

From the previous paragraph, we see that

\[ F(i) \equiv 0 \pmod p \quad \text{for } 1 \le i \le n \]

and

\[ a_0 F(0) \equiv a_0 (n!)^p \not\equiv 0 \pmod p. \]

Since $a_0 = -a_1 e - a_2 e^2 - \dots - a_n e^n$,

\[ 0 \not\equiv \sum_{i=0}^{n} a_i F(i) = \sum_{i=1}^{n} a_i \bigl(F(i) - e^i F(0)\bigr) = \sum_{i=1}^{n} a_i e^i \bigl(e^{-i} F(i) - e^{0} F(0)\bigr) \pmod p. \]

Now it remains to estimate the size of this non-zero integer. Since
$\deg(f) = K$, we have $f^{(K+1)} = 0$. A routine calculation shows that

\[ \frac{d}{dx}\bigl(e^{-x} F(x)\bigr) = -e^{-x} \sum_{j=0}^{K} f^{(j)}(x) + e^{-x} \sum_{j=0}^{K} f^{(j+1)}(x) = -e^{-x} f(x). \]

By the mean value theorem, there are real numbers $c_i \in (0, i)$ so that

\[ \bigl| e^{-i} F(i) - e^{0} F(0) \bigr| = i \left| \frac{d}{dx}\bigl(e^{-x} F(x)\bigr)(c_i) \right| = i\, e^{-c_i} |f(c_i)| \le n \max_{0 \le x \le n} |f(x)| \le \frac{n^{K+1}}{(p-1)!}. \]

The last estimate comes from $(p-1)!\,|f(x)| = |x^{p-1}(1 - x)^p \cdots (n - x)^p| \le n^K$ for $0 \le x \le n$.
Let $A = \max_{0 \le i \le n} |a_i|$. Then one can estimate

\[ \left| \sum_{i=1}^{n} a_i e^i \bigl(e^{-i} F(i) - e^{0} F(0)\bigr) \right| \le \sum_{i=1}^{n} A e^n \frac{n^{K+1}}{(p-1)!} = \frac{A e^n n^{K+2}}{(p-1)!} = \frac{A n e^n (n^{n+1})^p}{(p-1)!}. \]

So the idea is to choose a prime $p$ so large that this fraction is less than 1.
If this fraction is denoted as $B_p$, we see that for $p \ge 2n^{n+1}$,

\[ \frac{B_{p+1}}{B_p} = \frac{n^{n+1}}{p} \le \frac{1}{2}. \]

Thus, by the ratio test,

\[ \lim_{p \to \infty} B_p = 0. \]

Choose the prime $p$ so large that $B_p < 1$. However the left-hand side represents
a non-zero integer. Clearly, this is contradictory.
Therefore $e$ does not satisfy any algebraic equation over $\mathbb{Q}$. □
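
The congruences for $F$ used in this proof can be checked for small parameters. The following sketch is an illustration under assumed small values ($n = 2$, $p = 5$), with hypothetical helper names. It builds $f$ with exact rational coefficients, sums all derivatives to obtain $F$, and evaluates $F$ at $0, 1, \dots, n$.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Product of two polynomials with Fraction coefficients (constant term first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def derivative(a):
    return [k * a[k] for k in range(1, len(a))]

def evaluate(a, x):
    total = Fraction(0)
    for c in reversed(a):
        total = total * x + c
    return total

def hermite_F_values(n, p):
    """Return [F(0), F(1), ..., F(n)] where F is the sum of all derivatives of
    f(x) = x^(p-1) (1-x)^p ... (n-x)^p / (p-1)!."""
    f = [Fraction(0)] * (p - 1) + [Fraction(1, factorial(p - 1))]
    for i in range(1, n + 1):
        for _ in range(p):
            f = poly_mul(f, [Fraction(i), Fraction(-1)])  # multiply by (i - x)
    values = [Fraction(0)] * (n + 1)
    g = f
    while g:  # g runs through f, f', f'', ... until it vanishes
        for i in range(n + 1):
            values[i] += evaluate(g, i)
        g = derivative(g)
    return values

n, p = 2, 5
values = hermite_F_values(n, p)
assert all(v.denominator == 1 for v in values)        # each F(i) is an integer
assert all(v % p == 0 for v in values[1:])            # F(i) = 0 (mod p) for 1 <= i <= n
assert values[0] % p == factorial(n) ** p % p != 0    # F(0) = (n!)^p (mod p), non-zero
```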


Exercises 


1. Show that >7,31 2-”" is transcendental. 


2. (a) Prove that if $a$ is transcendental and $q \in \mathbb{Q} \setminus \{0, 1\}$, then $a^q$ is
transcendental.
(b) Give an example of a transcendental number $a$ and an irrational
number $q$ such that $a^q$ is algebraic.

3. (a) Show that if $p(x) = (x - a)^d q(x)$ is a polynomial in $\mathbb{R}[x]$, then $p^{(j)}$
has the form $(x - a)^{d-j} r(x)$ for all $j \le d$.
(b) Moreover, if $q(a) \ne 0$, show that $r(a) \ne 0$.
(c) Show that a root $a$ of $p(x)$ is simple if and only if $\gcd(p, p')(a) \ne 0$.

4. Show that if $\alpha$ is transcendental and $\beta \ne 0$ is algebraic, then $\alpha + \beta$, $\alpha\beta$
and $\alpha^{-1}$ are all transcendental.

5. If $0 \le a_k \le 9$ are integers for $k \ge 1$ and infinitely many are non-zero,
then $w = \sum_{n \ge 1} a_n 10^{-n!}$ is transcendental.


6. (a) Show that if gcd(a,b) = 1, then |/15— ¢| > OE: 
(b) If n is a positive square free integer, find a C' so that | /n— a> oH 


6.8. Sturm’s Algorithm 


Recall the factorization theorem for real polynomials. Define the discriminant
of a quadratic polynomial $p(x) = ax^2 + bx + c$ by $\Delta(p) = b^2 - 4ac$.
This theorem can be restated as:

6.8.1. Theorem. The irreducible polynomials in $\mathbb{R}[x]$ are the linear polynomials,
and the quadratic polynomials with negative discriminant. The
roots of irreducible quadratic polynomials are a conjugate pair $\{\alpha, \bar{\alpha}\}$ of non-real
complex numbers.


As in Exercise or Lemma 7.8.6, we may test for multiple roots by
computing $\gcd(p, p')$. Moreover, all the roots are simple roots exactly when
$\gcd(p, p') = 1$. (That lemma may be read independently of the rest of Chapter 7.
It is easier in the case of the reals, and other fields of characteristic
0, because the derivative of a non-constant polynomial must be non-zero.)

We now describe an algorithm known as Sturm’s Algorithm for count- 
ing the number of real roots of a real polynomial with simple roots in any 
interval. The key is the Euclidean algorithm with a special sign convention. 

Start with a real polynomial $p(x)$ with simple roots. Set $p_0 = p$ and $p_1 = p'$.
Apply the Euclidean algorithm by repeated use of the division algorithm.
Recall that dividing $p_i$ into $p_{i-1}$ yields a quotient $a_i$ and a remainder which
we call $-p_{i+1}$, so that

\[ p_{i-1} = a_i p_i - p_{i+1}. \]

Since $\gcd(p, p') = 1$, this procedure eventually terminates with the relation

\[ p_{n-1} = a_n p_n - 0 \]

where $p_n$ is a scalar (since it is a scalar multiple of $\gcd(p, p') = 1$).
For each real number $a$, consider the sequence

\[ p_0(a),\ p_1(a),\ \dots,\ p_{n-1}(a),\ p_n(a). \]

We say a sign change occurs at $p_i$ and $a$ if $p_i(a)p_{i+1}(a) < 0$; i.e., $p_i(a)$ is
positive and $p_{i+1}(a)$ is negative, or vice versa. We also say a sign change occurs at $p_i$
and $a$ if $p_{i-1}(a) > 0$, $p_i(a) = 0$, and $p_{i+1}(a) < 0$, or if $p_{i-1}(a) < 0$, $p_i(a) = 0$,
and $p_{i+1}(a) > 0$. In fact, the proof below will show that if $p_i(a) = 0$, then
$p_{i-1}(a)p_{i+1}(a) < 0$; so there is always a sign change at $p_i$ and $a$.

If a sign change occurs at $p_i$ and $a$, we write $\chi_i(a) = 1$; otherwise we
write $\chi_i(a) = 0$. Let

\[ \chi(a) = \chi_0(a) + \dots + \chi_{n-1}(a); \]

in other words, $\chi(a)$ is the total number of sign changes in the sequence
$p_0(a), p_1(a), \dots, p_n(a)$.


6.8.2. Sturm’s Theorem. Let $p(x) \in \mathbb{R}[x]$ be a polynomial with simple
roots. Then the number of real roots in the interval $[a, b]$ is $\chi(a) - \chi(b)$.

Proof. Since $\gcd(p_i, p_{i+1}) = 1$, the polynomials $p_i$ and $p_{i+1}$ have no common
roots. If $p_k(t) = 0$, then

\[ p_{k-1}(t) = a_k(t) p_k(t) - p_{k+1}(t) = -p_{k+1}(t). \]

From this, we can deduce that if $p_k(t) = 0$, then $p_{k \pm 1}$ are non-zero and
of opposite signs in a neighbourhood of $t$. The constant function $p_n$ never
changes sign. Moreover, the roots of $p_0$ are simple, so $p_0$ changes sign at
each root.

Consider the effect on the function $\chi_0$ near a root $t$ of $p_0$. Note that a
sign change in $p_0$ does not affect $\chi_k$ for $k \ge 1$, as these quantities do not
depend on $p_0$. Since $t$ is a simple root, $p_0$ changes sign at $t$. Suppose that
the sign change of $p_0$ is from positive to negative. Then $p_0$ is decreasing near
$t$, and thus the derivative $p_1$ is negative near $t$. So there is a sign change
from positive to negative between $p_0$ and $p_1$ on the interval $(t - \varepsilon, t)$, but no
change (from negative to negative) on the interval $(t, t + \varepsilon)$ for small $\varepsilon > 0$.
In other words, the function $\chi_0$ decreases by one at $t$:

\[ \lim_{\varepsilon \to 0^+} \chi_0(t + \varepsilon) - \chi_0(t - \varepsilon) = -1. \]

Similarly, if $p_0$ changes sign from negative to positive at $t$, then $p_0$ is increasing,
and $p_1$ is positive near $t$. So again there is a sign change between
$p_0$ and $p_1$ on the interval $(t - \varepsilon, t)$, but no change on the interval $(t, t + \varepsilon)$
for small $\varepsilon > 0$. So again the function $\chi_0$ decreases by one at $t$.

Next consider the effect of a zero $t$ of $p_k$ for $1 \le k < n$. The resulting
(possible) change of sign of $p_k$ may affect both $\chi_{k-1}$ and $\chi_k$. As shown
above, $p_{k \pm 1}$ are of opposite signs in a neighbourhood of $t$. Now

\[ \gcd(p_{k-1}, p_k) = 1 = \gcd(p_k, p_{k+1}), \]

and thus $p_k$ has no roots in common with $p_{k \pm 1}$. Hence there exists $\varepsilon > 0$ for
which $p_{k \pm 1}$ are non-zero on $[t - \varepsilon, t + \varepsilon]$ and $p_k$ has only $t$ as a root in this
interval. So, we may assume without loss of generality that $p_{k-1} < 0$ and
$p_{k+1} > 0$ on $[t - \varepsilon, t + \varepsilon]$. Observe that a change of signs is possible for $p_k$
at $t$. We make the following table

            p_{k-1}    p_k    p_{k+1}
   t - ε       -        ?        +
   t           -        0        +
   t + ε       -        ?        +

where we do not know the signs of $p_k(t \pm \varepsilon)$. Changing the sign from $-$
to $+$ results in increasing $\chi_{k-1}(t + \varepsilon)$ by 1 and decreasing $\chi_k(t + \varepsilon)$ by 1,
leaving $\chi_{k-1}(t + \varepsilon) + \chi_k(t + \varepsilon)$ the same. Similarly a change from $+$ to $-$
results in decreasing $\chi_{k-1}(t + \varepsilon)$ by 1 and increasing $\chi_k(t + \varepsilon)$ by 1, again
leaving $\chi_{k-1}(t + \varepsilon) + \chi_k(t + \varepsilon)$ the same. Of course, if the sign of $p_k$ does
not change, this also has no effect on $\chi(t + \varepsilon)$. A sign change in $p_k$ does not
affect $\chi_j$ except for $j = k - 1$ and $k$. Therefore regardless of these signs, we
see $\chi(t - \varepsilon) = \chi(t + \varepsilon)$.

The theorem now follows. Our above analysis proves that $\chi(a) - \chi(b) = \chi_0(a) - \chi_0(b)$.
So if $\chi(a) - \chi(b) = n$, this must be a result of a decrease of
1 in the value of $\chi_0$ at each of $n$ zeros of $p_0$ between $a$ and $b$. □


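6.8.3. Example. Consider the polynomial $p(x) = x^5 - 3x - 1$. One checks
that $\gcd(p, p') = \gcd(x^5 - 3x - 1, 5x^4 - 3) = 1$, so that $p$ has simple roots.
Then

\[ p_1(x) = 5x^4 - 3 \]
\[ p_2(x) = \frac{x}{5}\, p_1(x) - p(x) = \frac{12x + 5}{5} \]
\[ p_3(x) = \left( \frac{5^2}{12} x^3 - \frac{5^3}{12^2} x^2 + \frac{5^4}{12^3} x - \frac{5^5}{12^4} \right) p_2(x) - p_1(x) = 3 - \frac{5^5}{12^4} > 0. \]

Consider the table of signs in Table 6.8.1. Since only the signs matter, $p_2$ and $p_3$ are replaced there by the positive multiples $12x + 5$ and 1.

This chart shows that there are three real roots. One lies in each of the
intervals $(-2, -1)$, $(-1, 0)$ and $(1, 2)$. We could refine this by checking the
points $-1.9, -1.8, \dots, -1.1$, etc., to get more detail.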
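
Sturm’s algorithm is short to implement with exact rational arithmetic. The sketch below is one possible Python rendering and makes its own choices: the function names are hypothetical, zeros in the sign sequence are simply skipped when counting changes (which agrees with the convention above when an interior zero is flanked by opposite signs), and the input polynomial is assumed to have simple roots, none of them at the endpoints tested.

```python
from fractions import Fraction

def trim(a):
    """Drop trailing zero coefficients (coefficients are constant term first)."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, b):
    """Remainder of a divided by b over the rationals."""
    a = trim(a)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= factor * c
        a = trim(a)
    return a

def sturm_sequence(p):
    """p_0 = p, p_1 = p', and p_{i+1} = -(remainder of p_{i-1} divided by p_i)."""
    p0 = trim([Fraction(c) for c in p])
    p1 = trim([k * p0[k] for k in range(1, len(p0))])
    seq = [p0, p1]
    while len(seq[-1]) > 1:
        seq.append([-c for c in poly_rem(seq[-2], seq[-1])])
    return seq

def evaluate(a, x):
    total = Fraction(0)
    for c in reversed(a):
        total = total * x + c
    return total

def sign_changes(seq, x):
    signs = [v > 0 for v in (evaluate(q, x) for q in seq) if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def count_roots(p, a, b):
    """Number of real roots of p between a and b, as chi(a) - chi(b)."""
    seq = sturm_sequence(p)
    return sign_changes(seq, Fraction(a)) - sign_changes(seq, Fraction(b))

p = [-1, -3, 0, 0, 0, 1]  # x^5 - 3x - 1
print([count_roots(p, a, a + 1) for a in range(-2, 2)])  # expected: [1, 1, 0, 1]
```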


TABLE 6.8.1. Sign changes

    x     p = x^5 - 3x - 1    p_1 = 5x^4 - 3    p_2 = 12x + 5    p_3 = 1    χ
   -∞            -                   +                 -             +       3
   -2            -                   +                 -             +       3
   -1            +                   +                 -             +       2
    0            -                   -                 +             +       1
    1            -                   +                 +             +       1
    2            +                   +                 +             +       0
   +∞            +                   +                 +             +       0


Exercises


1. Use Sturm’s algorithm to find the number of zeros of «7 − 7x? + 8.
2. Use Sturm’s algorithm to show that $x^3 + ax + b$ has three real roots
(counting multiplicity) if and only if
\[ \Delta := -4a^3 - 27b^2 \ge 0. \]
Remember to deal with the case of repeated roots separately.

3. Solve the previous two exercises using calculus. (For simple polynomials
like these, calculus is easier.)

4. Locate all real roots of $x^7 - 259x^5 - 510x^4 + 2x^3 - 518x - 1020$ within 0.5
using Sturm’s algorithm.

5. Use Sturm’s algorithm to locate all real roots of $x^5 - 5x^3 + 2x - 1$ up to
an error of 0.1.

6. If $f \in \mathbb{R}[x]$ has repeated roots, explain how to factor $f$ into a product
of polynomials with simple roots.

6.9. Symmetric Functions 


Consider the polynomial

\[ \prod_{i=1}^{n} (x - y_i) = x^n - (y_1 + y_2 + \dots + y_n) x^{n-1} + \dots + (-1)^n y_1 y_2 \cdots y_n \]
\[ = x^n - P_1(y_1, y_2, \dots, y_n) x^{n-1} + \dots + (-1)^n P_n(y_1, y_2, \dots, y_n) \]
\[ = x^n + \sum_{i=1}^{n} (-1)^i P_i(y_1, y_2, \dots, y_n)\, x^{n-i}. \]

The coefficients of $x^i$ are special polynomials in $\{y_1, y_2, \dots, y_n\}$.

\[ P_1 = \sum_{i} y_i = y_1 + y_2 + \dots + y_n \]
\[ P_2 = \sum_{i < j} y_i y_j = y_1 y_2 + y_1 y_3 + \dots + y_{n-1} y_n \]
\[ P_k = \sum_{i_1 < i_2 < \dots < i_k} y_{i_1} y_{i_2} \cdots y_{i_k} \]
\[ P_n = \prod_{i} y_i = y_1 y_2 \cdots y_n \]

The values of these polynomials are not changed if the $y_i$’s are permuted.
In general, a function of several variables is called symmetric if
it is invariant under permutation of the variables. That is to say, for every
permutation $\pi$ of $\{1, 2, \dots, n\}$,

\[ f(y_{\pi(1)}, y_{\pi(2)}, \dots, y_{\pi(n)}) = f(y_1, y_2, \dots, y_n). \]

Moreover, each of these polynomials is homogeneous. A polynomial
$p \in F[y_1, \dots, y_n]$ is called homogeneous of degree $k$ if

\[ p(ty_1, ty_2, \dots, ty_n) = t^k p(y_1, y_2, \dots, y_n) \quad \text{for } t \in F. \]

Notice that $P_k$ is homogeneous of degree $k$ for $1 \le k \le n$.

The functions $P_1, P_2, \dots, P_n$ are called elementary symmetric polynomials.
The rather surprising fact is that every symmetric polynomial in
$n$ variables can be expressed uniquely as a polynomial in $P_1, \dots, P_n$. Let us
look at an example.


6.9.1. Example. For $n = 3$, the elementary symmetric polynomials are

\[ P_1 = y_1 + y_2 + y_3 \]
\[ P_2 = y_1 y_2 + y_1 y_3 + y_2 y_3 \]
\[ P_3 = y_1 y_2 y_3. \]

Consider the symmetric polynomial

\[ p = 2 \sum_{i=1}^{3} y_i^3 - 3 \sum_{i \ne j} y_i^2 y_j + 12 y_1 y_2 y_3 \]
\[ = 2(y_1^3 + y_2^3 + y_3^3) - 3(y_1^2 y_2 + y_1^2 y_3 + y_2^2 y_1 + y_2^2 y_3 + y_3^2 y_1 + y_3^2 y_2) + 12 y_1 y_2 y_3. \]

This is perhaps the natural way to write down a symmetric polynomial,
by collecting together all monomials of the same type. So for $n = 3$ and
polynomials homogeneous of degree 3, there are the three polynomials

\[ q_1 = \sum_{i=1}^{3} y_i^3, \qquad q_2 = \sum_{i \ne j} y_i^2 y_j, \qquad q_3 = y_1 y_2 y_3. \]

So $p = 2q_1 - 3q_2 + 12q_3$.
Let us compute the symmetric polynomials homogeneous of degree 3
which can be obtained as monomials in $P_1$, $P_2$ and $P_3$. They are

\[ P_1^3 = (y_1 + y_2 + y_3)^3 = q_1 + 3q_2 + 6q_3 \]
\[ P_1 P_2 = (y_1 + y_2 + y_3)(y_1 y_2 + y_1 y_3 + y_2 y_3) = q_2 + 3q_3 \]
\[ P_3 = y_1 y_2 y_3 = q_3. \]

Notice that only $P_1^3$ contains the term $y_1^3$, and so is the only one which
requires $q_1$ in its expression. After subtracting $2P_1^3$ from $p$, the polynomial is
a combination of $q_2$ and $q_3$. Of the remaining two terms, only $P_1 P_2$ contains
the term $y_1^2 y_2$, and hence requires $q_2$ in its expression. So a multiple of $P_1 P_2$
can be subtracted off leaving a multiple of $q_3 = P_3$.

We can use vector notation to simplify the calculation involved. Since

\[ (2, -3, 12) = 2(1, 3, 6) - 9(0, 1, 3) + 27(0, 0, 1), \]

we obtain the relation

\[ p = 2P_1^3 - 9P_1 P_2 + 27P_3. \]
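
The final identity of Example 6.9.1 can be confirmed by evaluating both sides at integer points. The short Python check below is a sketch added for illustration; the function names are hypothetical.

```python
from itertools import product

def p(y1, y2, y3):
    """The symmetric polynomial 2*q1 - 3*q2 + 12*q3 from the example."""
    cubes = y1**3 + y2**3 + y3**3
    mixed = (y1**2 * y2 + y1**2 * y3 + y2**2 * y1
             + y2**2 * y3 + y3**2 * y1 + y3**2 * y2)
    return 2 * cubes - 3 * mixed + 12 * y1 * y2 * y3

def via_elementary(y1, y2, y3):
    """The same polynomial written in terms of P1, P2, P3."""
    P1 = y1 + y2 + y3
    P2 = y1 * y2 + y1 * y3 + y2 * y3
    P3 = y1 * y2 * y3
    return 2 * P1**3 - 9 * P1 * P2 + 27 * P3

# Both sides have degree at most 3 in each variable, so agreement on a
# 4 x 4 x 4 grid of points already forces the two polynomials to be equal.
assert all(p(*y) == via_elementary(*y) for y in product(range(-1, 3), repeat=3))
```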


6.9.2. Theorem. Every symmetric polynomial in $n$ variables with coefficients
in a field $F$ can be expressed uniquely as a polynomial with coefficients
in $F$ in the $n$ elementary symmetric polynomials.

Proof. The example basically explains how to proceed in general. We proceed
by induction. Given a symmetric polynomial $p(y_1, y_2, \dots, y_n)$, let $m$ be
the largest degree of any monomial in $p$. Choose the term of degree $m$ so
that the power of $y_1$ is as large as possible, and after that, the power of $y_2$
is as large as possible, and so on. Thus $p$ contains a term

\[ a\, y_1^{k_1} y_2^{k_2} \cdots y_n^{k_n} \]

where $k_1 \ge k_2 \ge \dots \ge k_n$ and $k_1 + k_2 + \dots + k_n = m$. Call this the ‘largest’
term in $p$.

We assume the induction hypothesis that the theorem holds for symmetric
polynomials of lower degree, and for polynomials of the same degree
such that the largest term

\[ b\, y_1^{l_1} y_2^{l_2} \cdots y_n^{l_n} \]

precedes that of $p$ in the lexicographic order on the exponents. That is, we
say that $(l_1, \dots, l_n) \prec (k_1, \dots, k_n)$ if $l_1 < k_1$, or if $l_i = k_i$ for $1 \le i < i_0$ and
$l_{i_0} < k_{i_0}$.

The idea is to write down the monomial in $P_1, \dots, P_n$ which has the
same largest term and subtract off an appropriate multiple. It is not too
hard to see that this polynomial is precisely

\[ P = P_1^{k_1 - k_2} P_2^{k_2 - k_3} \cdots P_{n-1}^{k_{n-1} - k_n} P_n^{k_n}. \]

This is because the ‘largest’ term of $P$ is the product of the ‘largest’ terms
of each factor, namely

\[ y_1^{k_1 - k_2} (y_1 y_2)^{k_2 - k_3} \cdots (y_1 y_2 \cdots y_n)^{k_n}. \]

Indeed, the exponent of $y_i$ in this product is

\[ (k_i - k_{i+1}) + (k_{i+1} - k_{i+2}) + \dots + (k_{n-1} - k_n) + k_n = k_i. \]

Now the polynomial $p - aP$ has a smaller ‘largest’ term. So by the
induction hypothesis, it can be expressed uniquely as a polynomial in the
elementary symmetric polynomials. Adding the monomial $aP$ to this yields
a polynomial expression in $P_1, \dots, P_n$ for $p$ as well. This expression is unique
since there was a unique choice, $aP$, of a symmetric function with the same
largest term as $p$ and having removed that, there is a unique expression for
the remainder. □


The most important use of symmetric functions is based on the fact
that the coefficients of a polynomial are precisely the elementary symmetric
functions of the roots. This should be clear from their definition, but it
bears repeating. The monic polynomial with roots $r_1, \dots, r_n$ is

\[ p(x) = \prod_{i=1}^{n} (x - r_i) = x^n + \sum_{i=1}^{n} (-1)^i P_i(r_1, r_2, \dots, r_n)\, x^{n-i}. \]

In particular, if $q = x^n + q_{n-1}x^{n-1} + \dots + q_1 x + q_0$ is an irreducible polynomial
in $\mathbb{Q}[x]$ with roots $r_1, \dots, r_n$, so that $r_1, \dots, r_n$ are all algebraic conjugates, we see that the
elementary symmetric functions of the roots are rational:

\[ P_i(r_1, \dots, r_n) = (-1)^i q_{n-i}. \]

Thus Theorem 6.9.2 implies that every symmetric polynomial in these roots
with coefficients in $\mathbb{Q}$ is rational.
This provides one way of proving the following result.

6.9.3. Theorem. The algebraic numbers form a field.

Proof. It must be shown that if $\alpha$ and $\beta$ are algebraic numbers, then so are
$\alpha + \beta$, $\alpha\beta$ and $1/\alpha$. It was shown in Section 6.6, Exercise 4 that the reciprocals
of algebraic numbers are algebraic. The methods for sums and products are
similar. So only sums will be done here.
Let $p$ and $q$ be irreducible polynomials in $\mathbb{Q}[x]$ with $\alpha$ and $\beta$ as roots.
Let $\alpha_1, \dots, \alpha_m$ be the roots of $p$; and let $\beta_1, \dots, \beta_n$ be the roots of $q$. It is
enough to show that the polynomial with roots $\alpha_i + \beta_j$ for $1 \le i \le m$ and
$1 \le j \le n$ has rational coefficients. However, we know that if $P_1, \dots, P_{mn}$
are the elementary symmetric polynomials in $mn$ variables (and $P_0 = 1$), then

\[ r(x) = \prod_{1 \le i \le m} \prod_{1 \le j \le n} (x - \alpha_i - \beta_j) = \sum_{k=0}^{mn} (-1)^k P_k(\alpha_i + \beta_j)\, x^{mn-k} \]

where $P_k(\alpha_i + \beta_j)$ is a symmetric function of the $mn$ roots $\alpha_i + \beta_j$. Thus,
thinking of this as a function of the $\beta_j$, it is a symmetric polynomial with
coefficients that are symmetric functions of the $\alpha_i$ with rational coefficients.
Therefore, these coefficients are themselves rational. Thus $P_k(\alpha_i + \beta_j)$ is
reduced to a symmetric polynomial in the $\beta_j$ with rational coefficients. So
it is a rational number.

We conclude that $r \in \mathbb{Q}[x]$, and hence its roots are all algebraic. In
particular, $\alpha + \beta$ is algebraic. □
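
The construction in this proof can be watched in action on a small case. The sketch below uses the assumed example $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ (not taken from the text) and floating-point arithmetic. It multiplies out $\prod (x - \alpha_i - \beta_j)$ over all conjugates and observes that the coefficients are, up to rounding error, the integers of $x^4 - 10x^2 + 1$.

```python
from itertools import product

def poly_mul(a, b):
    """Product of two polynomials with float coefficients (constant term first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

alphas = [2 ** 0.5, -(2 ** 0.5)]   # roots of x^2 - 2
betas = [3 ** 0.5, -(3 ** 0.5)]    # roots of x^2 - 3

r = [1.0]
for a, b in product(alphas, betas):
    r = poly_mul(r, [-(a + b), 1.0])   # multiply by (x - a - b)

rounded = [round(c) for c in r]
assert all(abs(c - n) < 1e-9 for c, n in zip(r, rounded))
print(rounded)  # expected: [1, 0, -10, 0, 1], i.e. x^4 - 10x^2 + 1
```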


Exercises 


1. Express 2{+25+ es as a polynomial in the three elementary symmetric 
polynomials in three variables. 


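2. Verify that if $\alpha$ and $\beta$ are algebraic numbers, then so is $\alpha\beta$.

3. Let $\alpha = \sqrt{3}$ and $\beta = \sqrt[3]{7}$.

(a) Find a monic polynomial $q$ of degree 6 in $\mathbb{Q}[x]$ with $\alpha + \beta$ as a root.

(b) Show that $\gamma_{j,k} := (-1)^j \alpha + \omega^k \beta$ are also roots of $q$ for $j \in \{0, 1\}$,
$k \in \{0, 1, 2\}$ and $\omega = \frac{-1 + i\sqrt{3}}{2}$.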

(c) Check that Po(7o0,---, 71,2) is the coefficient of x? in gq. 


4. (Newton-Girard identities) Fix an integer n > 2. For each k > 0, 
let de = oy 4 ee which is known as a power sum. Let Po,..., P, denote 
the elementary symmetry symmetric functions in 71,...,%p. Since the 
dk are symmetric, they are expressible in terms of the P;. Prove the 
following explicit formula: 


k 


k (ri +--+ +7rp~— 1)! ‘ 
qk = (-1)"k S Aaa ae [[CP”. 
ryt2rot--+krp=k dere aaa i=1 


5. Let the complete Bell Polynomials be defined recursively by Bp = 1 and 


k 


k 
Bea Ginga): (8) Bester ++) Dp i) Li41- 


i=0 
d\* kK gi
By(x1,..., 2K) = (5) exp (>: af] 
i=1 : 


6. (Express elementary symmetric polynomials using power sums) 

Use the notation of Exercise [4] 

(a) Prove that the elementary symmetric functions are expressible as 
polynomials with rational coefficients in the power sums. Specifi- 
cally, prove 

(= ! ! ! 
Py = —7—Br(—an, — (1 )aa, —(2!)a3,---, —(& — 1)!6). 

(b) Conclude that every symmetric polynomial with rational coefficients 
is expressible as a polynomial in the power sums. 


Prove 


7. Prove that the elementary symmetric polynomials are not expressible as 
polynomials with integer coefficients in the power sums. 


6.10. Cubic Polynomials 


In this section, we will show how to use the power of symmetric polynomials 
to factor cubic equations in C[z]. It is nice to know that there is such a 
formula, although it is too complicated to be of much practical use. In par- 
ticular, even when all three roots are real, the formula still requires complex 
numbers. 

To illustrate the idea in a simpler setting, first consider a quadratic
polynomial $x^2 + ax + b$. Let the two roots be $r_1$ and $r_2$. We know that

\[ r_1 + r_2 = P_1(r_1, r_2) = -a, \qquad r_1 r_2 = P_2(r_1, r_2) = b. \]

The symmetric function of the roots $(r_1 - r_2)^2$ is given by

\[ (r_1 - r_2)^2 = (r_1 + r_2)^2 - 4 r_1 r_2 = a^2 - 4b. \]

Hence

\[ r_1 = \tfrac12\bigl((r_1 + r_2) + (r_1 - r_2)\bigr) = \frac{-a + \sqrt{a^2 - 4b}}{2}, \]
\[ r_2 = \tfrac12\bigl((r_1 + r_2) - (r_1 - r_2)\bigr) = \frac{-a - \sqrt{a^2 - 4b}}{2}. \]

For cubics, the same kind of technique works, although it is a fair bit
more complicated. The first simplifying step is to make a change of variables
to eliminate the quadratic term. This is analogous to completing the square
in the quadratic case. Suppose we are given a cubic polynomial

\[ w^3 + Aw^2 + Bw + C. \]

Make the substitution $w = x - A/3$. Then we obtain a polynomial

\[ (x - A/3)^3 + A(x - A/3)^2 + B(x - A/3) + C = x^3 + ax + b, \]

where $a = B - A^2/3$ and $b = C - AB/3 + 2A^3/27$. If we can find the roots
$x_1$, $x_2$ and $x_3$ of this cubic, then the roots of the original are $w_i = x_i - A/3$.
The elementary functions in the $x_i$ are

\[
\begin{aligned}
P_1 &= x_1 + x_2 + x_3 = 0 \\
P_2 &= x_1 x_2 + x_1 x_3 + x_2 x_3 = a \\
P_3 &= x_1 x_2 x_3 = -b.
\end{aligned}
\]

The idea is to look for some 'almost symmetric' functions $y_i$ of the roots
which are roots of $y^3 = d$ for some $d$. We investigate the properties such a
$y$ must have. Let $D$ represent a cube root of $d$ and let $\omega = e^{2\pi i/3}$ be a cube
root of 1. Then

\[ y^3 - D^3 = (y - D)(y - \omega D)(y - \omega^2 D). \]


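This suggests writing down the following functions of the roots $x_1$, $x_2$ and
$x_3$. (This change of coordinates is known as a discrete Fourier transform.)

\[
\begin{aligned}
y_1 &= x_1 + \omega x_2 + \omega^2 x_3 \\
y_2 &= \omega y_1 = \omega x_1 + \omega^2 x_2 + x_3 \\
y_3 &= \omega^2 y_1 = \omega^2 x_1 + x_2 + \omega x_3 \\
z_1 &= x_1 + \omega^2 x_2 + \omega x_3 \\
z_2 &= \omega^2 z_1 = \omega^2 x_1 + \omega x_2 + x_3 \\
z_3 &= \omega z_1 = \omega x_1 + x_2 + \omega^2 x_3
\end{aligned}
\]

We find that

\[ y_1^3 = y_2^3 = y_3^3 = y_1 y_2 y_3, \qquad z_1^3 = z_2^3 = z_3^3 = z_1 z_2 z_3 \]

and

\[ y_1 z_1 = y_2 z_2 = y_3 z_3. \]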


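For convenience, write the subscripts mod 3 (so that $x_4$ means $x_1$). A
computation shows that

\[ y_1^3 = \sum_{i=1}^{3} x_i^3 + 3\omega \sum_{i=1}^{3} x_i^2 x_{i+1} + 3\omega^2 \sum_{i=1}^{3} x_i x_{i+1}^2 + 6 x_1 x_2 x_3 \]

\[ z_1^3 = \sum_{i=1}^{3} x_i^3 + 3\omega^2 \sum_{i=1}^{3} x_i^2 x_{i+1} + 3\omega \sum_{i=1}^{3} x_i x_{i+1}^2 + 6 x_1 x_2 x_3. \]

Neither of these is symmetric, but their sum is. Since $\omega + \omega^2 = -1$,

\[
\begin{aligned}
y_1^3 + z_1^3 &= 2\sum_{i=1}^{3} x_i^3 - 3\sum_{i=1}^{3} x_i^2 x_{i+1} - 3\sum_{i=1}^{3} x_i x_{i+1}^2 + 12 x_1 x_2 x_3 \\
&= 2\Bigl(\bigl(\sum_{i=1}^{3} x_i\bigr)^3 - 3\sum_{i \ne j} x_i^2 x_j - 6 x_1 x_2 x_3\Bigr) - 3\sum_{i \ne j} x_i^2 x_j + 12 P_3 \\
&= 2(P_1^3 - 6P_3) - 9\Bigl(\bigl(\sum_{i=1}^{3} x_i\bigr)\bigl(\sum_{i<j} x_i x_j\bigr) - 3 x_1 x_2 x_3\Bigr) + 12 P_3 \\
&= 2P_1^3 - 9P_1 P_2 + 27 P_3 = -27b.
\end{aligned}
\]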


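Notice the big advantage of simplicity in this formula occurs because $P_1 = 0$.
Similarly, compute

\[
\begin{aligned}
y_1 z_1 &= \sum_{i=1}^{3} x_i^2 + (\omega + \omega^2) \sum_{1 \le i < j \le 3} x_i x_j \\
&= \sum_{i=1}^{3} x_i^2 - \sum_{1 \le i < j \le 3} x_i x_j \\
&= \Bigl(\sum_{i=1}^{3} x_i\Bigr)^2 - 3 \sum_{1 \le i < j \le 3} x_i x_j \\
&= P_1^2 - 3P_2 = -3a.
\end{aligned}
\]

Hence $y_1$, $y_2$, $y_3$, $z_1$, $z_2$ and $z_3$ are the roots of

\[ (X^3 - y_1^3)(X^3 - z_1^3) = X^6 - (y_1^3 + z_1^3) X^3 + (y_1 z_1)^3 = X^6 + 27bX^3 - 27a^3. \]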

This is a quadratic in $X^3$, and thus it can be solved:

\[ X^3 = \frac{-27b \pm \sqrt{27(27b^2 + 4a^3)}}{2}. \]

Because of the symmetry involved, we can let $y_1$ be any cube root

\[ y_1 = \sqrt[3]{\frac{-27b + \sqrt{27(27b^2 + 4a^3)}}{2}}. \]

Then $z_1 = -3a/y_1$. So from the equations for $y_1$ and $z_1$, we obtain

\[ y_1 + z_1 = 2x_1 - x_2 - x_3 = 3x_1 - P_1. \]

In the same way, $y_3 + z_3 = 3x_2 - P_1$ and $y_2 + z_2 = 3x_3 - P_1$.
Thus the roots of the cubic $x^3 + ax + b$ are

\[
\begin{aligned}
x_1 &= y_1/3 - a/y_1 \\
x_2 &= \omega^2 y_1/3 - \omega a/y_1 \\
x_3 &= \omega y_1/3 - \omega^2 a/y_1.
\end{aligned}
\]

To get the roots of the original cubic, add $-A/3$.


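6.10.1. Example. Consider the polynomial $x^3 - 7x + 6$, which you can
factor by hand using the rational root theorem, but which has the virtue of
being computable by hand in our formula. Here $a = -7$ and $b = 6$. We have

\[
\begin{aligned}
y_1 &= \sqrt[3]{\frac{-27(6) + \sqrt{27\bigl(27(6)^2 + 4(-7)^3\bigr)}}{2}} \\
&= \sqrt[3]{-81 + 3\sqrt{729 - 1029}} \\
&= \sqrt[3]{-81 + 30\sqrt{3}\,i} = 3 + 2\sqrt{3}\,i.
\end{aligned}
\]

Now, for future convenience, compute

\[ -\frac{a}{y_1} = \frac{7}{3 + 2\sqrt{3}\,i} = \frac{3 - 2\sqrt{3}\,i}{3}. \]

Plugging this in to the formulae, we obtain

\[
\begin{aligned}
x_1 &= \frac{3 + 2\sqrt{3}\,i}{3} + \frac{3 - 2\sqrt{3}\,i}{3} = 2 \\
x_2 &= \frac{(-1 - \sqrt{3}\,i)(3 + 2\sqrt{3}\,i) + (-1 + \sqrt{3}\,i)(3 - 2\sqrt{3}\,i)}{6} = 1 \\
x_3 &= \frac{(-1 + \sqrt{3}\,i)(3 + 2\sqrt{3}\,i) + (-1 - \sqrt{3}\,i)(3 - 2\sqrt{3}\,i)}{6} = -3.
\end{aligned}
\]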

Even for such a nice cubic, the calculations are daunting. However, this 
formula has the virtue of providing a closed form, algebraic expression for the 
roots. For finding approximate values of the roots, numerical methods based 
on calculus are much superior. Those methods, however, do not provide 
exact solutions. 
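
The formula of this section can also be evaluated in floating point, which gives a quick check of hand computations. The following is a minimal sketch under our own naming (`depressed_cubic_roots` and `cubic_roots` are hypothetical helpers, not from the text). It takes whichever of the two values of $X^3$ has larger absolute value as $y_1^3$, a choice we make only to avoid dividing by a value near zero when $a = 0$.

```python
import cmath

def depressed_cubic_roots(a, b):
    """Roots of x^3 + a*x + b via y1^3 = (-27b +/- sqrt(27(27b^2 + 4a^3))) / 2."""
    omega = cmath.exp(2j * cmath.pi / 3)              # primitive cube root of 1
    root = cmath.sqrt(27 * (27 * b**2 + 4 * a**3))
    # Either sign is allowed; pick the larger value so that y1 is not close to 0.
    y_cubed = max((-27 * b + root) / 2, (-27 * b - root) / 2, key=abs)
    if y_cubed == 0:                                  # only when a == b == 0
        return [0j, 0j, 0j]
    y1 = y_cubed ** (1 / 3)                           # any cube root will do
    z1 = -3 * a / y1                                  # because y1 * z1 = -3a
    return [(y1 + z1) / 3,
            (omega**2 * y1 + omega * z1) / 3,
            (omega * y1 + omega**2 * z1) / 3]

def cubic_roots(A, B, C):
    """Roots of w^3 + A*w^2 + B*w + C using the substitution w = x - A/3."""
    a = B - A**2 / 3
    b = C - A * B / 3 + 2 * A**3 / 27
    return [x - A / 3 for x in depressed_cubic_roots(a, b)]

roots = depressed_cubic_roots(-7, 6)                  # the cubic of Example 6.10.1
print(sorted(round(r.real, 9) for r in roots))
```

For $x^3 - 7x + 6$ the computed values agree with the roots $-3$, $1$ and $2$ up to rounding error, with imaginary parts that are numerically zero.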


Exercises 

1. Redo Exercise 2 from Section 6.8 without using Sturm's algorithm.

2. Find the roots of $x^3 - 6x + 9$.

3. Find the roots of $x^3 - 15x^2 + 60x - 54$.

4. Show that $\sqrt[3]{\sqrt{5} + 2} - \sqrt[3]{\sqrt{5} - 2} = 1$.

5. A sphere with outer radius $r$ which is 1 cm thick has the same volume
in the shell as in the interior hole. Find $r$.

6. (Cubic resolvent) If $f$ is a degree $n$ polynomial with roots $r_1, \dots, r_n$,
its discriminant is defined to be

\[ \Delta(f) = \prod_{i<j} (r_i - r_j)^2. \]

The cubic resolvent of a quartic polynomial $x^4 + ax^3 + bx^2 + cx + d$ is
defined to be the polynomial $x^3 - bx^2 + (ac - 4d)x - (a^2 d + c^2 - 4bd)$.
The cubic resolvent plays an important role in solving quartic equations.
Prove that a quartic polynomial and its cubic resolvent have the same
discriminant.


Notes on Chapter 6 


The formula for the roots of a cubic was discovered by the Italian
mathematicians del Ferro and Tartaglia in the early 16th century. Cardano and his
student Ferrari learned Tartaglia's method, and found a solution for the
quartic. The formula for quartics is considerably more complicated than
the cubic case. It was long an open problem whether such a solution could
be obtained for arbitrary polynomial equations. In order for this to be the
case, every algebraic number would have to be expressible as a combination
of various k-th roots. However, in 1826, the Norwegian algebraist Abel
published the first rigorous argument showing that there are polynomials of
degree 5 which cannot be solved by repeated extraction of roots.

Remarkable progress was made shortly after, in 1831, by the young 
mathematician Galois, who died in a duel when he was 20. Galois showed 
that one can study the roots of a polynomial by looking at the structure 
of the field obtained by adding all of the roots of this polynomial to the 
rationals. The set of all isomorphisms of this field onto itself forms a group. 
The structure of this group can be used to decide if a polynomial can be 
‘solved by radicals’, meaning that the roots can be expressed by extraction 
of roots. This is a very beautiful theory, and one of the landmarks of modern 
algebra. 

It was not until Viete in the late 16th century and Descartes in the early 
17th century that a good notation for polynomials was proposed. Stevin 
proved the Intermediate value theorem for polynomials, thereby showing 
that real polynomials of odd degree have a root. Descartes considered the 
graphs of polynomials. He found the rational root theorem and formulated 
his rule of signs, but did not publish his proof (as was common). He also ob- 
served that a polynomial of degree n has at most n roots. Newton showed 
that complex roots of real polynomials come in conjugate pairs. He also 
studied the symmetric functions of the roots and related them to the coef- 
ficients of a polynomial. 

The fundamental theorem of algebra was discussed in the notes to the
previous chapter.

Gauss's lemma comes from his early work in 1801. Eisenstein's criterion
dates from 1850. Sturm published his algorithm in 1829. It was the
first effective algebraic algorithm for locating roots of a polynomial to any
accuracy. In 1901, Kronecker published a set of lectures which includes a
statement and proof of unique factorization of (rational or integer)
polynomials into irreducibles.

Liouville was the first to construct transcendental numbers in 1851.
Hermite showed that $e$ is transcendental in 1873, and Lindemann showed that
$\pi$ is transcendental in 1882.

The general algebra of polynomials can be found in various introductions 
to abstract algebra such as Artin [5]. Sturm’s theorem and Descartes rule 
of signs can be found in and [28]. Hardy and Wright is a good 
source for information on algebraic and transcendental numbers; also see 


Stark and Silverman [84]. See Gray for more about the history. 


Chapter 7 


Finite Fields 


This chapter contains a detailed study of finite fields. It tries to empha- 
size the dramatic parallels between the arithmetic of the integers modulo 
a prime and the corresponding arithmetic of polynomials modulo an irre- 
ducible polynomial. At the end of the chapter, we will obtain an algorithm 
for factoring integer polynomials efficiently on a computer, in contrast to 
the (apparent) difficulty of factoring large integers. 


7.1. Arithmetic Modulo a Polynomial 


If $p$ is a polynomial in $\mathbb{F}[x]$, then it is possible to do calculations modulo
$p$. As in the integer case, say that polynomials $a(x)$ and $b(x)$ in $\mathbb{F}[x]$ are
congruent mod $p$ if $p$ divides $a - b$:

\[ a \equiv b \pmod{p} \quad\text{if and only if}\quad p \mid (a - b). \]

This yields a ring of equivalence classes, analogous to the rings $\mathbb{Z}_n$, called
$\mathbb{F}[x]/(p)$. The point is that addition and multiplication of equivalence classes
are well defined because of the following proposition. The proof is left as an
exercise. (Compare with Proposition 2.1.1.)


7.1.1. Proposition. Let $p$, $a_i$ and $b_i$ be polynomials in $\mathbb{F}[x]$ such that

\[ a_1 \equiv a_2 \pmod{p} \quad\text{and}\quad b_1 \equiv b_2 \pmod{p}. \]

Then
(1) $a_1 + b_1 \equiv a_2 + b_2 \pmod{p}$.
(2) $a_1 b_1 \equiv a_2 b_2 \pmod{p}$.

This means that addition and multiplication of equivalence classes can
be defined by

\[ [a] + [b] = [a + b] \quad\text{and}\quad [a][b] = [ab]. \]

One may verify the various properties of a commutative ring, such as
associativity of addition and multiplication, and the distributive law, because
these properties hold for the ring $\mathbb{F}[x]$.


7.1.2. Example. Consider the ring $S = \mathbb{R}[x]/(x^2 + 1)$. By the division
algorithm, every polynomial $q$ is equivalent modulo $x^2 + 1$ to its remainder
after division by $x^2 + 1$, which is a linear polynomial $a + bx$. Since the only
linear polynomial divisible by $x^2 + 1$ is 0, each linear polynomial belongs to
a different equivalence class. Thus

\[ S = \{[a + bx] : a, b \in \mathbb{R}\}. \]

Addition and multiplication are given by

\[ [a + bx] + [c + dx] = [(a + c) + (b + d)x] \]

\[ [a + bx][c + dx] = [ac + (ad + bc)x + bdx^2] = [(ac - bd) + (ad + bc)x]. \]

A moment's reflection will show that this corresponds to the rules of
multiplication in the complex numbers $\mathbb{C}$.
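
The correspondence can be seen concretely in a few lines of code. This is only an illustrative sketch: the function name `mul_mod_x2_plus_1` and the representation of $[a + bx]$ as a pair `(a, b)` are our own choices.

```python
def mul_mod_x2_plus_1(u, v):
    """Multiply [a + b x] and [c + d x] in R[x]/(x^2 + 1); pairs are (constant, x-coefficient)."""
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, and x^2 = -1 in the quotient ring.
    return (a * c - b * d, a * d + b * c)

u, v = (1.0, 2.0), (3.0, -4.0)
product = mul_mod_x2_plus_1(u, v)        # product of the classes [1 + 2x] and [3 - 4x]
as_complex = complex(*u) * complex(*v)   # product of 1 + 2i and 3 - 4i
print(product, as_complex)
```

Both computations give the same pair of real numbers, with $[x]$ in $S$ playing the role of $i$.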

This correspondence is not a coincidence. Notice that $\pm i$ are the two
roots of the irreducible polynomial $x^2 + 1$. In $S$, the equation $X^2 + 1 = 0$ has
the solution $[x]$. That is why $[x]$ takes the place of $i$. We can define a map $\varphi$
from $\mathbb{R}[x]$ into $\mathbb{C}$ by $\varphi(q) = q(i)$. One may check that $\varphi$ preserves addition
and multiplication. Moreover, $\varphi(q) = 0$ if and only if $i$ is a root of $q$. By
Theorem 6.6.2 it follows that $q$ is divisible by the minimal polynomial of $i$,
namely $x^2 + 1$. Thus $\varphi(q) = 0$ if and only if $[q] = 0$ in $S$. So there is an
induced map $\tilde\varphi : S \to \mathbb{C}$ given by $\tilde\varphi([q]) = q(i)$ as in Lemma [?]. The
point of the previous discussion is two-fold. First, $\tilde\varphi$ is well defined because
$q_1 \equiv q_2 \pmod{x^2 + 1}$ implies that $q_1(i) = q_2(i)$. Secondly, $\tilde\varphi$ is one-to-one
because $q_1(i) = q_2(i)$ implies that $x^2 + 1 \mid (q_1 - q_2)$; i.e. $q_1 \equiv q_2 \pmod{x^2 + 1}$.
So $\tilde\varphi$ maps $S$ one-to-one and onto $\mathbb{C}$, and preserves all the operations
(addition, multiplication, 0, 1). Therefore $\tilde\varphi$ is a ring isomorphism. (Recall
Definition 1.1.2.) This means that they represent the same mathematical
object.

The complex numbers form a field. The ring isomorphism $\tilde\varphi$ can be used
to show that $S$ is also a field. For any $s \ne 0$, let $z = \tilde\varphi(s)$. Since $\tilde\varphi$ is
one-to-one, $z \ne 0$. So $z^{-1} \in \mathbb{C}$. Since $\tilde\varphi$ is onto, $t = \tilde\varphi^{-1}(z^{-1}) \in S$ and

\[ st = \tilde\varphi^{-1}(z)\, \tilde\varphi^{-1}(z^{-1}) = \tilde\varphi^{-1}(z z^{-1}) = \tilde\varphi^{-1}(1) = 1. \]

That is, $t = s^{-1}$. Therefore $S$ is a field.


The polynomial $x^2 + 1$ is irreducible in $\mathbb{R}[x]$, and the quotient ring $S$
turned out to be a field. This is completely analogous to the fact that $\mathbb{Z}_n$ is
a field if and only if $n$ is prime.

7.1.3. Proposition. $\mathbb{F}[x]/(p)$ is a field if and only if $p$ is irreducible. If
$p$ is reducible, then $\mathbb{F}[x]/(p)$ has zero divisors.


Proof. If $p$ is not irreducible in $\mathbb{F}[x]$, then it factors as $p = ab$ where both
$a$ and $b$ have positive degree. Since $p$ does not divide either $a$ or $b$, the
equivalence classes $[a]$ and $[b]$ in $\mathbb{F}[x]/(p)$ are non-zero. However,

\[ [a][b] = [ab] = [p] = [0]. \]

So $\mathbb{F}[x]/(p)$ has zero divisors.

On the other hand, if $p$ is irreducible, and $[a] \ne [0]$, then $\gcd(a, p) = 1$.
Thus by the Euclidean algorithm for polynomials 6.2.4 there are polynomials
$s$ and $t$ in $\mathbb{F}[x]$ so that $1 = as + pt$. Hence

\[ [a][s] = [1 - pt] = [1]. \]

Therefore, all non-zero elements of $\mathbb{F}[x]/(p)$ are units, and so it is a field. ∎
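
The second half of this proof is effective: running the Euclidean algorithm on $a$ and $p$ while tracking the multiplier of $a$ produces the inverse $[s]$. The sketch below does this for the base field $\mathbb{Z}_p$ with $p$ prime. Polynomials are coefficient lists starting at degree 0, and all function names (`trim`, `sub_mul`, `poly_divmod`, `inverse_mod`) are our own. It assumes Python 3.8 or later for `pow(x, -1, p)`.

```python
def trim(f):
    """Drop trailing zeros; polynomials are coefficient lists from degree 0 upward."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def sub_mul(f, g, h, p):
    """Return f - g*h over Z_p."""
    out = list(f) + [0] * max(0, len(g) + len(h) - 1 - len(f))
    for i, x in enumerate(g):
        for j, y in enumerate(h):
            out[i + j] -= x * y
    return trim([c % p for c in out])

def poly_divmod(f, g, p):
    """Quotient and remainder of f by a nonzero g over Z_p (p prime)."""
    f, g = trim([c % p for c in f]), trim([c % p for c in g])
    quotient = [0] * max(len(f) - len(g) + 1, 0)
    inv_lead = pow(g[-1], -1, p)
    while len(f) >= len(g):
        shift = len(f) - len(g)
        coef = f[-1] * inv_lead % p
        quotient[shift] = coef
        f = sub_mul(f, [0] * shift + [coef], g, p)   # cancel the leading term
    return trim(quotient), f

def inverse_mod(a, q, p):
    """Return s with [a][s] = [1] in Z_p[x]/(q), assuming gcd(a, q) = 1."""
    r0, r1 = trim([c % p for c in q]), trim([c % p for c in a])
    s0, s1 = [], [1]                      # invariant: r_i = s_i * a (mod q)
    while r1:
        quo, rem = poly_divmod(r0, r1, p)
        r0, r1 = r1, rem
        s0, s1 = s1, sub_mul(s0, quo, s1, p)
    if len(r0) != 1:
        raise ValueError("a and q are not coprime")
    c = pow(r0[0], -1, p)                 # scale the constant gcd to 1
    return poly_divmod([c * x for x in s0], q, p)[1]

# Inverse of [x + 1] in Z_2[x]/(x^3 + x + 1): the result stands for x^2 + x.
inv = inverse_mod([1, 1], [1, 1, 0, 1], 2)
print(inv)
```

If $p$ is reducible and $a$ shares a factor with it, the final gcd is not a constant and the sketch raises an error, which matches the zero-divisor case of the proposition.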

The significance of this construction comes from the fact that it provides
a method for constructing a bigger field containing $\mathbb{F}$ in which $p$ has a root.
Let us record this as a theorem.


7.1.4. Theorem. If $p \in \mathbb{F}[x]$ is irreducible, then the field $G = \mathbb{F}[x]/(p)$
contains $\mathbb{F}$ as a subfield, and $p$ has a root in $G$.


Proof. Notice that $\mathbb{F}$ sits inside $G$ as the constant polynomials $[a]$ for
$a \in \mathbb{F}$. The element $[x]$ is a root of $p$ in $G$ because, writing $p(x) = \sum_{i=0}^{d} p_i x^i$,

\[ p([x]) = \sum_{i=0}^{d} p_i [x]^i = [p(x)] = [0]. \]
∎


You may have noticed that modding out by $p$ makes $[x]$ a root by fiat.
This is precisely the rationale for doing this operation at all.


Exercises 


1. Prove Proposition [71.1] 
a) Show that x° + 7x? — 7 € Z— a] is irreducible. 
b) Find the inverse of [x? + 32 — 1] in Q[z]/(x° + 7x? — 7). 


a) Show that x? +1 is irreducible in Z7[z]. 
b) Find the smallest integer k so that [2a7]* = 1 in Z7[x]/(a? +1). 
c) How many elements are there in Z7[x]/(a? + 1)? 


a) Show that if d is a square-free positive integer, then 


Q[Vd] = {r+sVd:r,s € Q} isa field. 
(b) Express this field as a quotient ring of Q[z]. 
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5. (a) Find an irreducible polynomial p € Z[z] with 3+ W7 as a root. 
(b) Show that Z[z]/(p) is a ring contained in the field Q/z]/(p). 


Show that Z,[z]/(2* + 23 +2 +4 1) is not a field for any prime p. 
7. (a) Show that if $a_1, a_2, p_1, p_2 \in \mathbb{F}[x]$ and $\gcd(p_1, p_2) = 1$, then the system

\[ q \equiv a_1 \pmod{p_1}, \qquad q \equiv a_2 \pmod{p_2} \]

has a unique solution (mod $p_1 p_2$).
(b) Prove the Chinese Remainder Theorem for arithmetic modulo polynomials;
i.e., if $\gcd(p_i, p_j) = 1$ whenever $1 \le i < j \le n$, show that

\[ q \equiv a_1 \pmod{p_1}, \quad \dots, \quad q \equiv a_n \pmod{p_n} \]

has a unique solution modulo $p_1 p_2 \cdots p_n$.


8. Suppose that $a_1, a_2, p_1, p_2, d \in \mathbb{F}[x]$ and $\gcd(p_1, p_2) = d$ and $d \notin \mathbb{F}$.
When does the system

\[ q \equiv a_1 \pmod{p_1}, \qquad q \equiv a_2 \pmod{p_2} \]

have solutions? What can you say about these solutions?


7.2. An Eight-Element Field 


Consider the polynomial $p(x) = x^3 + x + 1$ in $\mathbb{Z}_2[x]$. It is irreducible
because it has no roots, and has degree 3. Let us investigate the field
$\mathbb{F}_8 = \mathbb{Z}_2[x]/(x^3 + x + 1)$. By the division algorithm, the different equivalence
classes are again given by all polynomials of degree less than the degree of $p$,
namely the polynomials $a + bx + cx^2$ of degree at most 2. There are 2 choices
for each of $a$, $b$, $c$, so there are $8 = 2^3$ elements in $\mathbb{F}_8$. The multiplication
rules are given by the following table.

·       | 0 | 1       | x       | x+1     | x^2     | x^2+1   | x^2+x   | x^2+x+1
0       | 0 | 0       | 0       | 0       | 0       | 0       | 0       | 0
1       | 0 | 1       | x       | x+1     | x^2     | x^2+1   | x^2+x   | x^2+x+1
x       | 0 | x       | x^2     | x^2+x   | x+1     | 1       | x^2+x+1 | x^2+1
x+1     | 0 | x+1     | x^2+x   | x^2+1   | x^2+x+1 | x^2     | 1       | x
x^2     | 0 | x^2     | x+1     | x^2+x+1 | x^2+x   | x       | x^2+1   | 1
x^2+1   | 0 | x^2+1   | 1       | x^2     | x       | x^2+x+1 | x+1     | x^2+x
x^2+x   | 0 | x^2+x   | x^2+x+1 | 1       | x^2+1   | x+1     | x       | x^2
x^2+x+1 | 0 | x^2+x+1 | x^2+1   | x       | 1       | x^2+x   | x^2     | x+1

TABLE 7.2.1. Multiplication table for $\mathbb{F}_8$

It is apparent from this table that every element has an inverse. For
example, $[x + 1][x^2 + x] = [1]$. It would be difficult to find the compatible
addition and multiplication tables for a field of 8 elements without this
construction.

By Theorem 7.1.4 we see that the polynomial $X^3 + X + 1$ has a root $[x]$
in $\mathbb{F}_8$. In fact, it has three roots. A calculation using the table above shows
that

\[ ([x]^2)^3 + [x]^2 + 1 = [x^2][x^2 + x] + [x^2] + 1 = [x^2 + 1 + x^2 + 1] = [0]. \]

A similar calculation shows that $[x^2 + x]$ is a root as well.


So this yields the factorization

\[ X^3 + X + 1 = (X - [x])(X - [x^2])(X - [x^2 + x]). \]

Now consider the powers of $[x]$ in $\mathbb{F}_8$. We have $[x]$, $[x^2]$, $[x^3] = [x + 1]$,
$[x^4] = [x^2 + x]$, $[x^5] = [x^2 + x + 1]$, $[x^6] = [x^2 + 1]$, and $[x^7] = 1$. So the powers
of $[x]$ run through all the 7 non-zero elements of $\mathbb{F}_8$. This is a primitive root!
Notice that for any non-zero $a \in \mathbb{F}_8$, there is a $k$ so that $a = [x^k]$. So

\[ a^7 = [x^k]^7 = ([x]^7)^k = 1. \]

So $a$ is a root of $X^7 - 1 = 0$. Since $7 = 8 - 1$, this is a variant of Fermat's
little theorem for $\mathbb{F}_8$. We will establish this for all finite fields.
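
The list of powers of $[x]$ is easy to reproduce mechanically. In this sketch (our own naming), an element $c_0 + c_1 x + c_2 x^2$ of $\mathbb{F}_8$ is stored as the triple `(c0, c1, c2)`, and multiplication by $x$ uses the relation $x^3 = x + 1$.

```python
def times_x(elem):
    """Multiply c0 + c1*x + c2*x^2 by x in Z_2[x]/(x^3 + x + 1), using x^3 = x + 1."""
    c0, c1, c2 = elem
    return (c2, (c0 + c2) % 2, c1)

def show(elem):
    """Render a triple as a polynomial string, highest degree first."""
    terms = [name for coef, name in zip(elem, ("1", "x", "x^2")) if coef]
    return " + ".join(reversed(terms)) or "0"

powers = []
elem = (0, 1, 0)                 # the class [x]
for k in range(1, 8):
    powers.append(elem)
    print(f"[x]^{k} = [{show(elem)}]")
    elem = times_x(elem)
```

The seven triples produced are distinct and the last one is `(1, 0, 0)`, that is $[x]^7 = 1$, so $[x]$ is a primitive root.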

This means that $X^8 - X$ has 8 distinct roots in $\mathbb{F}_8$. So it factors into
linear terms in $\mathbb{F}_8$:

\[ X^8 - X = \prod_{a \in \mathbb{F}_8} (X - a). \]

Let us factor it in $\mathbb{Z}_2[X]$. A simple calculation shows that

\[ X^8 - X = X(X - 1)(X^3 + X + 1)(X^3 + X^2 + 1). \]

The two cubics are irreducible in $\mathbb{Z}_2[X]$ because they have no roots in $\mathbb{Z}_2$.
We saw above that $X^3 + X + 1$ factors into three linear terms in $\mathbb{F}_8[X]$. We
now also can factor

\[ X^3 + X^2 + 1 = (X - [x + 1])(X - [x^2 + 1])(X - [x^2 + x + 1]). \]
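
The factorization of $X^8 - X$ over $\mathbb{Z}_2$ stated above is the kind of "simple calculation" that is pleasant to delegate. A minimal sketch, with polynomials over $\mathbb{Z}_2$ stored as lists of bits starting at degree 0 (the helper names are ours):

```python
def mul_mod2(f, g):
    """Multiply two polynomials over Z_2; coefficient lists start at degree 0."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] ^= a & b          # addition mod 2 is XOR
    return out

def has_root_mod2(f):
    """True if f(0) = 0 or f(1) = 0 in Z_2."""
    return any(sum(c * t**k for k, c in enumerate(f)) % 2 == 0 for t in (0, 1))

# X, X - 1 = X + 1, X^3 + X + 1, X^3 + X^2 + 1
factors = [[0, 1], [1, 1], [1, 1, 0, 1], [1, 0, 1, 1]]
product = [1]
for f in factors:
    product = mul_mod2(product, f)

x8_minus_x = [0, 1, 0, 0, 0, 0, 0, 0, 1]   # X^8 - X = X^8 + X over Z_2
print(product == x8_minus_x)
print(has_root_mod2(factors[2]), has_root_mod2(factors[3]))
```

The product matches $X^8 + X$, and neither cubic has a root in $\mathbb{Z}_2$, which is enough for irreducibility in degree 3.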


It turns out that there is only one field of order 8. This may seem
surprising since there is a second irreducible polynomial of degree 3, namely
$x^3 + x^2 + 1$. It turns out that this other choice leads to an equivalent
field, in the sense that there is an isomorphism of one onto the other, as in
Example 7.1.2. Consider the other 8 element field, $G = \mathbb{Z}_2[y]/(y^3 + y^2 + 1)$.
As an exercise, write out the multiplication table for $G$. Notice that $[y]$ is
a root of $X^3 + X^2 + 1$ in $G$. But $\mathbb{F}_8$ also has roots of this polynomial; for
example, $[x + 1]$ is a root.

Consider the map $\varphi$ from $G$ to $\mathbb{F}_8$ given by

\[ \varphi([a + by + cy^2]) = [a + b(x + 1) + c(x + 1)^2] = [(a + b + c) + bx + cx^2]. \]

This map is easily seen to be a bijection, for

\[ \varphi([a_1 + b_1 y + c_1 y^2]) = \varphi([a_2 + b_2 y + c_2 y^2]) \]

implies that $b_1 = b_2$, $c_1 = c_2$ and $a_1 + b_1 + c_1 = a_2 + b_2 + c_2$, whence $a_1 = a_2$.
So $\varphi$ is one-to-one, and hence onto, since both fields have 8 elements.
More significantly, $\varphi$ preserves addition and multiplication.

\[
\begin{aligned}
\varphi([a_1 + b_1 y + c_1 y^2]) &+ \varphi([a_2 + b_2 y + c_2 y^2]) \\
&= [(a_1 + b_1 + c_1) + b_1 x + c_1 x^2] + [(a_2 + b_2 + c_2) + b_2 x + c_2 x^2] \\
&= [(a_1 + a_2 + b_1 + b_2 + c_1 + c_2) + (b_1 + b_2)x + (c_1 + c_2)x^2] \\
&= \varphi([(a_1 + a_2) + (b_1 + b_2)y + (c_1 + c_2)y^2])
\end{aligned}
\]

This shows that $\varphi$ preserves addition. Multiplication is more subtle, and
uses the fact that $[y]$ and $[x + 1]$ have the same minimal polynomial
$q(X) = X^3 + X^2 + 1$. Hence $[y^3] = [y^2 + 1]$ and $[y^4] = [y^2 + y + 1]$ and likewise
$[(x + 1)^3] = [(x + 1)^2 + 1]$ and $[(x + 1)^4] = [(x + 1)^2 + (x + 1) + 1]$. Thus

\[
\begin{aligned}
\varphi([a_1 &+ b_1 y + c_1 y^2])\, \varphi([a_2 + b_2 y + c_2 y^2]) \\
&= [a_1 + b_1(x + 1) + c_1(x + 1)^2]\,[a_2 + b_2(x + 1) + c_2(x + 1)^2] \\
&= [a_1 a_2 + (a_1 b_2 + a_2 b_1)(x + 1) + (b_1 b_2 + a_1 c_2 + a_2 c_1)(x + 1)^2 \\
&\qquad + (b_1 c_2 + b_2 c_1)(x + 1)^3 + c_1 c_2 (x + 1)^4] \\
&= [a_1 a_2 + (a_1 b_2 + a_2 b_1)(x + 1) + (b_1 b_2 + a_1 c_2 + a_2 c_1)(x + 1)^2 \\
&\qquad + (b_1 c_2 + b_2 c_1)((x + 1)^2 + 1) + c_1 c_2((x + 1)^2 + (x + 1) + 1)] \\
&= \varphi([a_1 a_2 + (a_1 b_2 + a_2 b_1)y + (b_1 b_2 + a_1 c_2 + a_2 c_1)y^2 \\
&\qquad + (b_1 c_2 + b_2 c_1)(y^2 + 1) + c_1 c_2(y^2 + y + 1)]) \\
&= \varphi([a_1 a_2 + (a_1 b_2 + a_2 b_1)y + (b_1 b_2 + a_1 c_2 + a_2 c_1)y^2 + (b_1 c_2 + b_2 c_1)y^3 + c_1 c_2 y^4]) \\
&= \varphi([a_1 + b_1 y + c_1 y^2][a_2 + b_2 y + c_2 y^2]).
\end{aligned}
\]

So we see that $\varphi$ is an isomorphism between these two fields of order 8.
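
Since both fields have only 8 elements, the claim can also be confirmed by brute force over all 64 pairs. The following sketch is a check, not a replacement for the argument above. Elements are triples `(c0, c1, c2)` of bits, and the reduction rules for the third and fourth powers of the generator are passed in explicitly. All names are our own.

```python
from itertools import product

def mul(u, v, cube, fourth):
    """Multiply triples (c0, c1, c2) over Z_2, given t^3 and t^4 as triples."""
    raw = [0] * 5
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            raw[i + j] += a * b
    return tuple((raw[k] + raw[3] * cube[k] + raw[4] * fourth[k]) % 2 for k in range(3))

def add(u, v):
    return tuple((s + t) % 2 for s, t in zip(u, v))

F8 = dict(cube=(1, 1, 0), fourth=(0, 1, 1))   # x^3 = x + 1,   x^4 = x^2 + x
G = dict(cube=(1, 0, 1), fourth=(1, 1, 1))    # y^3 = y^2 + 1, y^4 = y^2 + y + 1

def phi(e):
    """The map [a + b y + c y^2] -> [(a + b + c) + b x + c x^2]."""
    a, b, c = e
    return ((a + b + c) % 2, b, c)

elements = list(product((0, 1), repeat=3))
pairs = [(u, v) for u in elements for v in elements]
is_bijection = len({phi(e) for e in elements}) == 8
preserves_add = all(phi(add(u, v)) == add(phi(u), phi(v)) for u, v in pairs)
preserves_mul = all(phi(mul(u, v, **G)) == mul(phi(u), phi(v), **F8) for u, v in pairs)
print(is_bijection, preserves_add, preserves_mul)
```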


Exercises 


1. Construct the multiplication table for $G = \mathbb{Z}_2[y]/(y^3 + y^2 + 1)$.

2. Construct the multiplication table for $F = \mathbb{Z}_3[x]/(x^2 + x - 1)$.

3. (a) Factor $X^9 - X$ in $\mathbb{Z}_3[X]$.
(b) Factor $X^9 - X$ in Q[x]/(x° + 7x? — 7).

4. Show that $F = \mathbb{Z}_3[x]/(x^2 + x - 1)$ is isomorphic to $G = \mathbb{Z}_3[y]/(y^2 + 1)$.

5. (a) Show that $x^2 + 1$ and $x^2 + x + 4$ are irreducible in $\mathbb{Z}_{11}[x]$.

(b) Factor $X^2 + X + 4$ in $F = \mathbb{Z}_{11}[x]/(x^2 + 1)$.

(c) Construct an explicit isomorphism from $G = \mathbb{Z}_{11}[x]/(x^2 + x + 4)$
onto the field $F$.

6. (a) Find all irreducible quadratics in $\mathbb{Z}_2[x]$.
(b) Construct a 4-element field.
(c)* Show that this list of four matrices with coefficients in $\mathbb{Z}_2$

\[ \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \]

form a field under the usual addition and multiplication of matrices,
modulo 2.

(d)* Find an isomorphism between the fields that you constructed in
parts (b) and (c).


7.3. Fermat’s Little Theorem for Finite Fields 


In this section, we will show that certain results about modular arithmetic 
for Z, are valid for all finite fields. Moreover, the proofs in many cases are 
almost unchanged from the integer case. This will lead to strong structural 
results for finite fields. 


7.3.1. Proposition. Let $p$ be prime, and let $q(x)$ be an irreducible polynomial
of degree $d$ in $\mathbb{Z}_p[x]$. Then the field $\mathbb{Z}_p[x]/(q)$ has cardinality $p^d$.


Proof. This is just the observation that each $[a]$ agrees with $[r]$ where $r$ is
its remainder on dividing $a$ by $q$. This remainder has degree at most $d - 1$.
Conversely, two distinct polynomials $r_1$ and $r_2$ of degree at most $d - 1$ must
represent different equivalence classes. This is because $r_1 \equiv r_2 \pmod{q}$ if
and only if $q$ divides $r_1 - r_2$, a polynomial of degree at most $d - 1$. Since $q$
has larger degree, this can happen only when $r_1 - r_2 = 0$. So $r_1 = r_2$.

It remains to count the number of polynomials of degree at most $d - 1$
in $\mathbb{Z}_p[x]$. They can be written as $a_0 + a_1 x + \dots + a_{d-1} x^{d-1}$ where each $a_i$ is an
arbitrary element of $\mathbb{Z}_p$. There are $p$ choices for each coefficient $a_i$. Hence
there are $p^d$ choices for the different equivalence classes. ∎


In order to show that all finite fields are of this type, we must develop 
various properties of finite fields. The first result is the analogue of Fermat’s 
little theorem. 


7.3.2. Theorem. Let $\mathbb{F}$ be a finite field of cardinality $n$. Then $a^{n-1} = 1$
for every $a \ne 0$ in $\mathbb{F}$.

Proof. The proof is the same as in $\mathbb{Z}_p$. Define a map $f : \mathbb{F} \to \mathbb{F}$ by
$f(x) = ax$. This map is one-to-one. To see this, notice that if $f(x) = f(y)$,
then

\[ 0 = f(x) - f(y) = a(x - y). \]

Since $a \ne 0$ and $\mathbb{F}$ has no zero divisors, it follows that $x = y$. Also, $f(0) = 0$.
Thus $f$ maps $\mathbb{F}^* = \mathbb{F} \setminus \{0\}$ into itself. A one-to-one function of a finite set
into itself must also be onto. So multiplication by $a$ merely permutes the
units. Therefore,

\[ \prod_{x \in \mathbb{F}^*} x = \prod_{x \in \mathbb{F}^*} ax = a^{n-1} \prod_{x \in \mathbb{F}^*} x. \]

Dividing by the product of the units, we get $a^{n-1} = 1$. ∎
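
Both steps of this proof can be watched in a small example. The sketch below uses the 9-element field $\mathbb{Z}_3[x]/(x^2 + 1)$ (our choice of example; $x^2 + 1$ has no roots in $\mathbb{Z}_3$), storing $c_0 + c_1 x$ as the pair `(c0, c1)`. It checks that multiplication by each unit permutes the units, and that every unit satisfies $a^{8} = 1$.

```python
from itertools import product

P = 3

def mul(u, v):
    """Multiply in Z_3[x]/(x^2 + 1): pairs (c0, c1) stand for c0 + c1*x, with x^2 = -1."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, k):
    result = (1, 0)
    for _ in range(k):
        result = mul(result, u)
    return result

units = [e for e in product(range(P), repeat=2) if e != (0, 0)]
n = P ** 2                                   # cardinality of the field

# Multiplication by a unit is a permutation of the set of units.
permutes = all(sorted(mul(a, x) for x in units) == sorted(units) for a in units)
# Fermat's little theorem for this field: a^(n-1) = 1 for every unit a.
fermat = all(power(a, n - 1) == (1, 0) for a in units)
print(permutes, fermat)
```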


7.3.3. Corollary. Let $\mathbb{F}$ be a finite field of cardinality $n$. Then one can
factor the polynomial $X^n - X$ in $\mathbb{F}[X]$ as $X^n - X = \prod_{a \in \mathbb{F}} (X - a)$.


Proof. By the previous theorem, every $a \in \mathbb{F}^*$ is a root of $X^{n-1} - 1$. Thus
every element of $\mathbb{F}$ is a root of $X^n - X$. This provides $n$ roots for this
polynomial of degree $n$. Hence it is a scalar multiple of $\prod_{a \in \mathbb{F}} (X - a)$. Since
the leading coefficient of both polynomials is 1, they are equal. ∎


7.3.4. Corollary. Let $q$ be an irreducible polynomial of degree $d$ in $\mathbb{Z}_p[x]$,
and form the field $\mathbb{F} = \mathbb{Z}_p[x]/(q)$. Then $q$ divides $x^{p^d} - x$ in $\mathbb{Z}_p[x]$; and $q(X)$
factors into linear terms in $\mathbb{F}[X]$.


Proof. Let $a = [x]$ be the known root of $q$ in $\mathbb{F}$. Thinking of $a$ as an
algebraic element over $\mathbb{Z}_p$, we see that $q$ must be the minimal polynomial
of $a$ in $\mathbb{Z}_p[X]$ because it is irreducible. Now by Fermat's little theorem for $\mathbb{F}$
(Theorem 7.3.2) and Proposition 7.3.1 we see that $a$ is a root of $X^{p^d} - X$. So by
Theorem 6.6.2 it follows that $q(X)$ divides $X^{p^d} - X$ in $\mathbb{Z}_p[X]$, say
$X^{p^d} - X = q(X) r(X)$.
By the previous corollary, $X^{p^d} - X$ factors into a product of linear terms.
By unique factorization into irreducible polynomials in $\mathbb{F}[X]$, it follows that
$q(X)$ factors into a product of $d$ linear terms in $\mathbb{F}[X]$. ∎
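
The divisibility statement in Corollary 7.3.4 says exactly that $[x]^{p^d} = [x]$ in $\mathbb{Z}_p[x]/(q)$, and this can be tested by repeated squaring without ever writing down $x^{p^d}$. The sketch below assumes $q$ is monic and stores polynomials as coefficient lists from degree 0. The function names and the sample polynomials are our own.

```python
def poly_rem(f, q, p):
    """Remainder of f modulo a monic polynomial q over Z_p, padded to length deg q."""
    f = [c % p for c in f]
    while len(f) >= len(q):
        lead = f.pop()                        # coefficient of the current top degree
        shift = len(f) - (len(q) - 1)
        for i in range(len(q) - 1):
            f[shift + i] = (f[shift + i] - lead * q[i]) % p
    return f + [0] * (len(q) - 1 - len(f))

def mul_mod(f, g, q, p):
    """Product of f and g reduced modulo q over Z_p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return poly_rem(out, q, p)

def x_to_the(n, q, p):
    """The class [x]^n in Z_p[x]/(q), computed by repeated squaring."""
    result, base = poly_rem([1], q, p), poly_rem([0, 1], q, p)
    while n:
        if n & 1:
            result = mul_mod(result, base, q, p)
        base = mul_mod(base, base, q, p)
        n >>= 1
    return result

# q = x^3 + x + 1 over Z_2 (d = 3): [x]^8 should equal [x].
print(x_to_the(2**3, [1, 1, 0, 1], 2))
# q = x^3 + x^2 + 2 over Z_3 (d = 3): [x]^27 should equal [x].
print(x_to_the(3**3, [2, 0, 1, 1], 3))
# For the reducible q = x^2 + 1 over Z_2 the conclusion can fail: [x]^4 = [1].
print(x_to_the(2**2, [1, 0, 1], 2))
```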


Exercises 


1. Find the analogue of Wilson's Theorem for the product of all the units
of any finite field.


2. Factor $x^{16} - x$ into irreducibles in $\mathbb{Z}_2[x]$.


3. Let $R$ be a finite integral domain. Prove that $R$ is a field.
Hint: Take $a \ne 0$ in $R$ and find $0 < m < n$ such that $a^m = a^n$.

4. Let $p$ be an odd prime, and let $q(x) = x^{p-1} - 1 - \prod_{k=1}^{p-1} (x - k)$ in $\mathbb{Z}_p[x]$.
Show that $\deg q < p - 1$ but $q$ has at least $p - 1$ roots. Hence deduce
Wilson's theorem for $\mathbb{Z}_p$.


7.4. Characteristic 


Now it is possible to count the number of elements in a finite field. The
prime integer $p$ in the following theorem is called the characteristic of the
field. This is the smallest positive integer $p$ such that the sum of $p$ ones in
$\mathbb{F}$ equals 0. If such a sum is never 0, say that the field has characteristic
zero. Examples of fields of characteristic zero are $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$.


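7.4.1. Theorem. Let $\mathbb{F}$ be a finite field. Then
(1) There is a prime $p$ so that $pa = 0$ for every $a \in \mathbb{F}$.
(2) $\mathbb{F}$ contains a copy of $\mathbb{Z}_p$.
(3) There is an integer $d$ so that $|\mathbb{F}| = p^d$.


Proof. First consider the elements of $\mathbb{F}$ given by $0$, $1$, $2 = 1 + 1$, $3 = 1 + 1 + 1$,
and so on. This is an infinite list, and since these are all elements of the
finite set $\mathbb{F}$, the list must repeat itself. So there are integers $k < m$ and sums

\[ k = \underbrace{1 + \dots + 1}_{k \text{ ones}} = \underbrace{1 + \dots + 1}_{m \text{ ones}} = m. \]

Subtracting yields

\[ 0 = m - k = \underbrace{1 + \dots + 1}_{m - k \text{ ones}}. \]

Let $p$ be the smallest positive integer such that the sum of $p$ ones equals
0. It must be shown that $p$ is prime. If it isn't prime, factor $p = jk$ where
$1 < j, k < p$. Then

\[ 0 = \underbrace{1 + \dots + 1}_{p \text{ ones}} = (\underbrace{1 + \dots + 1}_{j \text{ ones}})(\underbrace{1 + \dots + 1}_{k \text{ ones}}). \]

Neither of these terms is 0, by the minimality of $p$, so $\mathbb{F}$ contains zero
divisors, which is absurd. Therefore $p$ is prime. Now for any element $a \in \mathbb{F}$,

\[ pa = \underbrace{a + \dots + a}_{p \ a\text{'s}} = a(\underbrace{1 + \dots + 1}_{p \text{ ones}}) = a(0) = 0. \]
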

Let $S = \{0, 1, \dots, p - 1\}$ be the set of all possible sums of ones in $\mathbb{F}$.
Notice that this set is closed under addition and multiplication (because the
product of two sums of ones is a sum of ones). Moreover, by the paragraph
above, addition is calculated mod $p$. Clearly then, multiplication is also
calculated mod $p$. So $S$ is a copy of $\mathbb{Z}_p$ in $\mathbb{F}$. From now on, we will write an
integer $n$ to mean $n \pmod{p}$ as an element of $\mathbb{F}$.

The next problem is to find a way to represent the elements of $\mathbb{F}$ which
will allow us to count them. The idea is to find a minimal list $a_1 = 1$, $a_2$,
..., $a_d$ in $\mathbb{F}$ so that every $a \in \mathbb{F}$ can be expressed as a sum

\[ \sum_{i=1}^{d} n_i a_i, \qquad n_i \in \mathbb{Z}_p. \]

This is done recursively. If $\mathbb{F}$ is larger than $\mathbb{Z}_p$, choose some element $a_2 \in \mathbb{F}$
not in $\mathbb{Z}_p$. Then if $\{n_1 + n_2 a_2 : n_i \in \mathbb{Z}_p\}$ is not all of $\mathbb{F}$, choose $a_3 \in \mathbb{F}$ not
in this set. Repeat this until a set $a_1 = 1$, $a_2$, ..., $a_d$ is chosen so that

• $a_j \notin \bigl\{\sum_{i=1}^{j-1} n_i a_i : n_i \in \mathbb{Z}_p\bigr\}$ for $2 \le j \le d$.
• Every $a \in \mathbb{F}$ can be expressed as $\sum_{i=1}^{d} n_i a_i$ for some $n_i \in \mathbb{Z}_p$.

The important point of this representation is that every $a \in \mathbb{F}$ can be
represented as such a sum in exactly one way. If this were not the case,
there would be two different sums with the same total:

\[ \sum_{i=1}^{d} m_i a_i = \sum_{i=1}^{d} n_i a_i. \]

Subtracting yields

\[ \sum_{i=1}^{d} (m_i - n_i) a_i = 0. \]

It suffices to show that if $\sum_{i=1}^{d} k_i a_i = 0$, then all the coefficients $k_i$ are
0. If this were not so, let $i_0$ be the largest integer so that $k_{i_0} \ne 0$. Then
rearranging the equation and dividing by $k_{i_0}$ yields

\[ a_{i_0} = -k_{i_0}^{-1} \sum_{i=1}^{i_0 - 1} k_i a_i. \]

This contradicts the fact that no $a_j$ can be written as a combination of the
earlier $a_i$'s.

It follows that each coefficient $n_i$ can be any element of $\mathbb{Z}_p$. Since
different choices yield different sums, there are $p^d$ such sums. Thus $\mathbb{F}$ has $p^d$
elements. ∎


It is worth remarking that the last part of this proof is not really a
mysterious one. If the reader is familiar with vector spaces and linear algebra,
then the proof may be shortened considerably. Once $\mathbb{F}$ contains a copy of
the field $\mathbb{Z}_p$, it follows that $\mathbb{F}$ is a vector space over $\mathbb{Z}_p$. If $d$ is the dimension
of $\mathbb{F}$, then $|\mathbb{F}| = p^d$. Indeed, the set $a_1, \dots, a_d$ is a basis for $\mathbb{F}$ as a vector
space over $\mathbb{Z}_p$.

Exercises


1. Show that if $\mathbb{F}$ is a field of characteristic 0, then $\mathbb{F}$ contains a copy of
the rational numbers.


2. (a) For the field $\mathbb{Z}_3[x]/(x^4 + x^3 - 1)$, show that the set $\{1, [x], [x^2], [x^3]\}$
serves the role of $\{a_1, a_2, a_3, a_4\}$ in the proof of Theorem 7.4.1.
(b)* If you know some linear algebra, find the matrix for the linear
transformation $T[p] = [xp]$ with respect to this basis.


3. Let $\mathbb{F}$ be a field of characteristic $p > 0$.
(a) Prove that $(x + a)^p = x^p + a^p$ for $a \in \mathbb{F}$.
HINT: use the binomial theorem. What is $\binom{p}{k} \pmod{p}$?

(b) Deduce that if $a, b \in \mathbb{F}$, then $(a + b)^{p^k} = a^{p^k} + b^{p^k}$ for every $k \ge 1$.

4. Let $\mathbb{F}$ be a field of characteristic $p$.

(a) Prove that $\binom{p-1}{k} \equiv (-1)^k \pmod{p}$ for $0 \le k \le p - 1$.

(b) Hence show that if $a, b \in \mathbb{F}$, then $(a - b)^{p-1} = \sum_{k=0}^{p-1} a^k b^{p-1-k}$.

5. Suppose that $G \subset \mathbb{F}$ is a strict inclusion of finite fields of characteristic
$p > 0$. Let $|G| = p^d$ and $|\mathbb{F}| = p^e$. Modify the proof of Theorem 7.4.1
(3) to show that $d \mid e$.


7.5. Algebraic Elements 


If $a$ is an element of a field $\mathbb{F}$ of cardinality $p^d$, then by Fermat's Little
Theorem, $a$ is a root of the polynomial $X^{p^d} - X$ in $\mathbb{Z}_p[X]$. Thus there is an
irreducible factor $q(X)$ of $X^{p^d} - X$ such that $q(a) = 0$. This is the minimal
polynomial of $a$, which is algebraic over $\mathbb{Z}_p$. Theorem 6.6.2 is valid for
$\mathbb{Z}_p$ as well as $\mathbb{Q}$, and one can use the same proof verbatim replacing $\mathbb{Q}$ by
$\mathbb{Z}_p$. Thus we conclude that if $r(a) = 0$, then $q \mid r$. We state this for future
reference.


7.5.1. Proposition. If $a$ is an element of a field $\mathbb{F}$ of cardinality $p^d$, then
$a$ has a minimal polynomial $q \in \mathbb{Z}_p[X]$. The polynomial $q$ is a factor of
$X^{p^d} - X$. If $r \in \mathbb{Z}_p[X]$ satisfies $r(a) = 0$, then $q$ divides $r$.


Starting with this element $a$ in $\mathbb{F}$, consider the set $\mathbb{Z}_p[a]$ of all polynomials
in $a$. This set is a subset of $\mathbb{F}$ which is closed under addition and
multiplication because

\[ r(a) + s(a) = (r + s)(a) \quad\text{and}\quad r(a)s(a) = (rs)(a) \]

for all $r$ and $s$ in $\mathbb{Z}_p[X]$. Say that $a$ is a generator of $\mathbb{F}$ if $\mathbb{F} = \mathbb{Z}_p[a]$. The
following theorem explains what this subset is.

7.5.2. Theorem. Let $a$ be an element of a field $\mathbb{F}$ of cardinality $p^d$,
with minimal polynomial $q \in \mathbb{Z}_p[X]$. Then $\mathbb{Z}_p[a]$ is a field isomorphic to
$\mathbb{Z}_p[X]/(q)$.


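Proof. Define a map $\varphi : \mathbb{Z}_p[X] \to \mathbb{Z}_p[a]$ by $\varphi(r) = r(a)$. We have just seen
that

\[ \varphi(r + s) = (r + s)(a) = r(a) + s(a) = \varphi(r) + \varphi(s) \]

and

\[ \varphi(rs) = (rs)(a) = r(a)s(a) = \varphi(r)\varphi(s). \]

So $\varphi$ preserves addition and multiplication.

The map $\varphi$ is not one to one. Indeed, $\varphi(r) = 0$ if and only if $r(a) = 0$
which occurs if and only if $q \mid r$ by Proposition 7.5.1. Hence $\varphi(r_1) = \varphi(r_2)$
if and only if $q \mid (r_1 - r_2)$ which holds if and only if $r_1 \equiv r_2 \pmod{q}$. So we
may define a map $\tilde\varphi : \mathbb{Z}_p[X]/(q) \to \mathbb{F}$ by

\[ \tilde\varphi([r]) = r(a). \]

The value of $\tilde\varphi([r])$ is independent of choice of representative $r$, so $\tilde\varphi$ is well
defined on equivalence classes mod $q$. Therefore this definition makes sense.
Moreover, our calculation also shows that $\tilde\varphi$ is one to one. Every element of
$\mathbb{Z}_p[a]$ has the form $r(a) = \tilde\varphi([r])$, so the range of $\tilde\varphi$ is exactly $\mathbb{Z}_p[a]$. Thus
the map $\tilde\varphi$ is a bijection of $\mathbb{Z}_p[X]/(q)$ onto $\mathbb{Z}_p[a]$, and both sets have
cardinality $p^{\deg q}$.

Next notice that $\tilde\varphi$ preserves addition and multiplication because $\varphi$ does.
In other words,

\[ \tilde\varphi([r]) + \tilde\varphi([s]) = \varphi(r) + \varphi(s) = \varphi(r + s) = \tilde\varphi([r + s]) \]

and

\[ \tilde\varphi([r])\, \tilde\varphi([s]) = \varphi(r)\varphi(s) = \varphi(rs) = \tilde\varphi([rs]). \]

So $\tilde\varphi$ is a bijection between $\mathbb{Z}_p[X]/(q)$ and $\mathbb{Z}_p[a]$ which preserves all the field
operations, i.e., it is an isomorphism. ∎


Since the cardinality of $\mathbb{Z}_p[a]$ is at most $p^d$, we obtain the following
consequence.


7.5.3. Corollary. Let $a$ be an element of a field $\mathbb{F}$ of cardinality $p^d$ with
minimal polynomial $q \in \mathbb{Z}_p[X]$. Then $\deg q \le d$ and $\mathbb{Z}_p[a] = \mathbb{F}$ if and only
if $\deg q = d$.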


7.5.4. Example. Consider the field $F = \mathbb{Z}_3[x]/(x^3 + x^2 + 2)$, and the 
element $a = [x^2 + x + 2]$. Compute $a^2 = [x^2 + 2x + 2]$ and $a^3 = [x^2 + x]$. 
So we observe that $a$ is a root of $q(X) = X^3 + 2X + 2$. This is irreducible 
because it is a cubic with no roots in $\mathbb{Z}_3$. To compute $a^{-1}$ in $F$, we notice 
that 

$$0 = a^{-1}(a^3 + 2a + 2) = a^2 + 2 + 2a^{-1}.$$

Thus $a^{-1} = a^2 - 1$. Now we may factor $q = (X - a)(X^2 + aX + a^{-1})$ in 
$F[X]$. The quadratic factor will have roots 

$$\frac{-a \pm \sqrt{a^2 - 4a^{-1}}}{2} = \frac{-a \pm \sqrt{a^2 - (a^2 - 1)}}{2} = \frac{-a \pm 1}{2} = a \pm 1.$$

So 
$$q(X) = (X - a)(X - a - 1)(X - a - 2).$$

We also might notice that since $a^3 = a + 1$, $a^3$ has the same minimal 
polynomial as $a$. Also 
$$a^9 = (a^3)^3 = (a + 1)^3 = a^3 + 3a^2 + 3a + 1 = a^3 + 1 = a + 2.$$
Hence $a^9$ is the third root. Finally, notice that 
$$a^{27} = (a + 2)^3 = a^3 + 6a^2 + 12a + 8 = a^3 + 2 = a.$$

This foreshadows a general phenomenon that we will study in the 
section on automorphisms. 
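
The computations in this example are small enough to check by machine. The following is a minimal sketch in Python rather than MAPLE; the helper names (reduce, add, mul, power, q_at) and the representation of a field element as a list of three coefficients, constant term first, are assumptions made only for this illustration.

```python
# Arithmetic in F = Z_3[x]/(x^3 + x^2 + 2).  An element is a list [c0, c1, c2]
# standing for c0 + c1*x + c2*x^2.  All names here are illustrative.
P = 3
M = [2, 0, 1, 1]  # x^3 + x^2 + 2, constant term first

def reduce(f):
    """Reduce coefficients mod P and the polynomial mod M; return length 3."""
    f = [c % P for c in f]
    while len(f) > 3:
        lead = f.pop()              # leading coefficient being eliminated
        shift = len(f) - 3          # subtract lead * x^shift * M
        for i in range(3):
            f[shift + i] = (f[shift + i] - lead * M[i]) % P
    return f + [0] * (3 - len(f))

def add(f, g):
    return reduce([a + b for a, b in zip(f, g)])

def mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return reduce(out)

def power(f, e):
    result = [1, 0, 0]
    for _ in range(e):
        result = mul(result, f)
    return result

def q_at(e):
    """Evaluate q(X) = X^3 + 2X + 2 at the field element e."""
    return add(add(power(e, 3), mul([2, 0, 0], e)), [2, 0, 0])

ONE, ZERO = [1, 0, 0], [0, 0, 0]
a = [2, 1, 1]                           # a = [x^2 + x + 2]
a_inv = add(power(a, 2), [2, 0, 0])     # a^2 - 1, since -1 = 2 in Z_3
roots = [a, add(a, [1, 0, 0]), add(a, [2, 0, 0])]
print(power(a, 2), power(a, 3), a_inv, [q_at(r) for r in roots])
```

The three candidate roots a, a + 1 and a + 2 all evaluate to zero under q_at, and multiplying a by a_inv gives the identity, in agreement with the hand computation.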


Exercises 


1. In the field $F = \mathbb{Z}_2[x]/(x^4 + x + 1)$, find the minimal polynomial of the 
element $a = [x^2 + 1]$. 
HINT: compute the first four powers of $a$ and find a linear relationship 
among $\{1, a, a^2, a^3, a^4\}$. 


2. What is the cardinality of the subfield $\mathbb{Z}_2[b] \subset F$ in the previous exercise 
for $b = [x^2 + x]$? 


3. In the field $F = \mathbb{Z}_{19}[x]/(x^2 - 2)$, show that every element of $F \setminus \mathbb{Z}_{19}$ has 
a minimal polynomial of degree 2. 


4. Use Section[7.4]Exercise]to show that if a € F with minimal polynomial 
$q(x) \in \mathbb{Z}_p[x]$ and $|F| = p^d$, then $\deg q$ divides $d$. 


7.6. Finite Fields 


We will see now that all finite fields of characteristic $p$ arise from 
arithmetic modulo an irreducible polynomial over $\mathbb{Z}_p$. To get finer detail about 
the structure of $F$, we will need to know about primitive roots. Recall 
that a primitive root of $F$ is a unit $a$ such that the set of powers of $a$, 
$\{a, a^2, a^3, \dots\}$, is the full set of units $F^*$. In particular, primitive roots 
are generators of $F$. 


7.6.1. Theorem. Every finite field has a primitive root. 


Proof. Again, the proof is the same as for $\mathbb{Z}_p$. Let $n = |F|$, a power of $p$. The 
order of a unit $a$ is defined to be the least positive integer $d = \mathrm{ord}(a)$ such 
that $a^d = 1$. As in Proposition 2.10.2, it follows that if $a^k = 1 = a^l$, then 
$a^{\gcd(k,l)} = 1$ as well. Since $a^{n-1} = 1$, it then follows that $\mathrm{ord}(a) \mid n - 1$, as in 
the corresponding corollary for $\mathbb{Z}_p$. 

Following the proof of Lemma 2.10.5, let $f(d)$ count the number of 
elements $a \in F$ with $\mathrm{ord}(a) = d$ for each $d$ which divides $n - 1$. As before, 
notice that $\mathrm{ord}(a) \mid d$ if and only if $a$ is a root of $X^d - 1$ in $F$. This polynomial 
has at most $d$ roots. On the other hand, 

$$X^{n-1} - 1 = (X^d - 1)(X^{n-1-d} + X^{n-1-2d} + \dots + X^d + 1)$$

has exactly $n - 1$ roots, namely all the units of $F$, because $a^{n-1} = 1$ for every 
unit $a$. The second factor has at most $n - 1 - d$ roots. Thus each factor must 
have its full complement of roots. This yields the formula 

$$\sum_{e \mid d} f(e) = d.$$

As in the proof of Theorem 2.10.6, observe that this set of equations is 
also satisfied by the Euler $\varphi$ function. So as in that proof, we deduce that 
$f(d) = \varphi(d)$ for every divisor $d$ of $n - 1$. In particular, there are $\varphi(n - 1)$ 
elements of order $n - 1$. These are the primitive roots. ∎ 
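
The counting argument can be watched at work in a small case. The sketch below is an illustration only: it models the field of 9 elements as $\mathbb{Z}_3[i]$ with $i^2 = -1$ (an assumption that is valid because $x^2 + 1$ has no roots mod 3), stores $a + bi$ as the pair (a, b), and tabulates how many units have each order. The function names are hypothetical.

```python
from math import gcd

P = 3  # F_9 is modelled as Z_3[i] with i^2 = -1 (x^2 + 1 is irreducible mod 3)

def mul(u, v):
    """Multiply a + b*i by c + d*i modulo 3."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def order(u):
    """Least k >= 1 with u^k = 1, found by repeated multiplication."""
    k, w = 1, u
    while w != (1, 0):
        w = mul(w, u)
        k += 1
    return k

def euler_phi(d):
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)

units = [(a, b) for a in range(P) for b in range(P) if (a, b) != (0, 0)]
f = {}                                   # f[d] = number of units of order d
for u in units:
    f[order(u)] = f.get(order(u), 0) + 1
print(sorted(f.items()))
```

The dictionary f agrees with the Euler function on every divisor of 8, so there are four primitive roots in this field.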


We can use primitive roots to provide a familiar criterion for when an 
element of F is a square. 


7.6.2. Proposition. Let $p$ be an odd prime, and let $F$ be a field of 
cardinality $p^d$. A non-zero element $a \in F$ is a square in $F$ if and only if 
$a^{(p^d-1)/2} = 1$. 


Proof. Let $c = a^{(p^d-1)/2}$. Then by Fermat's little theorem for $F$, 
$c^2 = a^{p^d-1} = 1$. Thus $c$ is a root of $x^2 - 1 = 0$; whence $c \in \{\pm 1\}$. 

Let $b$ be a primitive root for $F$. Then $b^{(p^d-1)/2} = -1$ since it is a square 
root of $b^{p^d-1} = 1$ which is distinct from 1. If $a = b^k$ for $0 \le k < p^d - 1$, then 

$$a^{(p^d-1)/2} = (b^{(p^d-1)/2})^k = (-1)^k.$$

This equals 1 if and only if $k$ is even. 

If $a = e^2$ for some $e \in F$ and $e = b^l$ for $0 \le l < p^d - 1$, then $a = b^{2l}$. So 
$k \equiv 2l \pmod{p^d - 1}$. Since $2l$ and $p^d - 1$ are both even, this forces $k$ to be 
even. Conversely, if $k = 2l$, then $a = (b^l)^2$ is a square. ∎ 
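
As a quick sanity check of this criterion, the following sketch compares the set of non-zero squares in the field of 9 elements with the set of units satisfying $a^{(9-1)/2} = 1$. As an assumption for the illustration, the field is again modelled as $\mathbb{Z}_3[i]$ with $i^2 = -1$, and the helper names are hypothetical.

```python
P = 3  # F_9 modelled as Z_3[i], i^2 = -1; the pair (a, b) stands for a + b*i

def mul(u, v):
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

units = [(a, b) for a in range(P) for b in range(P) if (a, b) != (0, 0)]
squares = {mul(u, u) for u in units}                      # actual squares
criterion = {u for u in units if power(u, (P**2 - 1) // 2) == (1, 0)}
print(sorted(squares), sorted(criterion))
```

Both sets have four elements, exactly half of the eight units.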


Now we have the necessary tools to prove the main theorem about finite 
fields. 


7.6.3. Theorem. Let $F$ be a finite field of cardinality $p^n$. There is an 
irreducible polynomial $q \in \mathbb{Z}_p[x]$ of degree $n$ so that $F$ is isomorphic to $\mathbb{Z}_p[x]/(q)$. 
Moreover, $X^{p^n} - X = \prod_{a \in F}(X - a)$ factors into linear terms with $p^n$ distinct 
roots. 




Proof. Let $a$ be a primitive root of $F$. Let $q$ be the minimal polynomial of 
$a$. The subfield $\mathbb{Z}_p[a]$ contains $a^k$ for $0 \le k < p^n - 1$. As this is a list of all 
the non-zero elements of $F$, we obtain $\mathbb{Z}_p[a] = F$. By Theorem 7.5.2, there 
is an isomorphism of $\mathbb{Z}_p[X]/(q)$ onto $F$. Since 

$$p^{\deg q} = |\mathbb{Z}_p[X]/(q)| = |F| = p^n,$$

we see that $q$ has degree exactly $n$. 
By Corollary 7.3.3, $X^{p^n} - X = \prod_{a \in F}(X - a)$ in $F[X]$. This has degree $p^n$ 
and has $p^n$ distinct roots. ∎ 


Since there are different irreducible factors of degree d, it is possible that 
there are many different finite fields of each cardinality. However, this is not 
the case. 


7.6.4. Corollary. There is only one field $F$ of cardinality $p^n$ up to 
isomorphism. 


Proof. Suppose that $F$ and $G$ are finite fields of cardinality $p^n$. By the 
preceding theorem, there is an irreducible polynomial $q$ of degree $n$ so that $F$ is 
isomorphic to $\mathbb{Z}_p[X]/(q)$. Moreover, $q$ is a factor of $X^{p^n} - X$, so we obtain a 
factorization $X^{p^n} - X = q(X)r(X)$. 

By Corollary 7.3.3 applied to $G$, we see that $X^{p^n} - X$ factors into linear 
terms in $G[X]$. As we have seen before, the polynomial $q(X)$ must have 
exactly $n$ roots in $G$. Let $b$ be such a root. Then the minimal polynomial 
of $b$ in $\mathbb{Z}_p[X]$ is $q$ since $q$ is irreducible. As in the proof of the preceding 
theorem, $\mathbb{Z}_p[X]/(q)$ is isomorphic to $\mathbb{Z}_p[b]$. In particular, $\mathbb{Z}_p[b]$ has $p^n$ elements, 
and thus is all of $G$. So $F$ and $G$ are both isomorphic to $\mathbb{Z}_p[X]/(q)$, and thus 
to each other. ∎ 


Because of this corollary, there is at most one field of cardinality $p^n$ for 
each prime $p$ and positive integer $n$. We will call it $\mathbb{F}_{p^n}$. We still need to 
show that $\mathbb{F}_{p^n}$ always exists. 


7.6.5. Corollary. Every irreducible polynomial of degree $n$ in $\mathbb{Z}_p[x]$ splits 
into a product of linear terms in $\mathbb{F}_{p^n}$. 


Proof. This is an immediate corollary of Corollary 7.3.4 and the uniqueness 
of $\mathbb{F}_{p^n}$ established above. ∎ 


7.6.6. Example. Consider $p(x) = x^4 + x^2 + x + 1$ in $\mathbb{Z}_3[x]$. This is 
irreducible. To see this, first notice that it has no roots in $\mathbb{Z}_3$. So if it factors, 
it is into a product of two quadratics. There are only three irreducible 
quadratics in $\mathbb{Z}_3[x]$, namely $x^2 + 1$, $x^2 - x - 1$ and $x^2 + x - 1$. None of these 
divide $p$, so $p$ is irreducible. Form the field $F = \mathbb{Z}_3[x]/(p)$ with 81 elements. 
To find a primitive root, we require an element of order 80. As for prime 
integers, it suffices to show that $\mathrm{ord}(a)$ is not a proper divisor of $80 = 2^4 \cdot 5$. 
Thus an element $a$ such that $a^{40} \ne 1$ and $a^{16} \ne 1$ must be a primitive 
root. Using computer software, we compute $[x]^{40} = 1$. A second try is 
$[x + 1]^{40} = -1$ and $[x + 1]^{16} = [x^3 - 1]$. Thus $[x + 1]$ is a primitive root. 
Going back to the element $[x]$, we compute $[x]^{10} = [-x^3 - x^2 - x + 1]$ and 
$[x]^8 = [-x^3 - x^2 + x]$. Since $[x]^{10} \ne \pm 1$, we have $[x]^{20} \ne 1$; together with 
$[x]^8 \ne 1$ this shows that $\mathrm{ord}([x]) = 40$. 
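
The powers quoted in this example can be reproduced with a few lines of code. The sketch below uses Python in place of the computer algebra system; the coefficient-list representation (constant term first) and the names reduce, mul, power and order are assumptions made for the illustration.

```python
# Arithmetic in F = Z_3[x]/(x^4 + x^2 + x + 1), a field with 81 elements.
P = 3
M = [1, 1, 1, 0, 1]        # x^4 + x^2 + x + 1, constant term first
N = len(M) - 1

def reduce(f):
    """Reduce coefficients mod P and the polynomial mod M; return length N."""
    f = [c % P for c in f]
    while len(f) > N:
        lead = f.pop()
        shift = len(f) - N
        for i in range(N):
            f[shift + i] = (f[shift + i] - lead * M[i]) % P
    return f + [0] * (N - len(f))

def mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return reduce(out)

ONE = [1, 0, 0, 0]

def power(f, e):
    """Square-and-multiply exponentiation in F."""
    result, base = ONE, f
    while e:
        if e & 1:
            result = mul(result, base)
        base = mul(base, base)
        e >>= 1
    return result

def order(f):
    """Multiplicative order of a non-zero element, by brute force."""
    k, w = 1, f
    while w != ONE:
        w = mul(w, f)
        k += 1
    return k

x = [0, 1, 0, 0]
x_plus_1 = [1, 1, 0, 0]
print(order(x), order(x_plus_1), power(x_plus_1, 16), power(x, 10))
```

In this representation $-1$ appears as the coefficient 2, so $[x^3 - 1]$ is printed as [2, 0, 0, 1].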


Exercises 


1. Check by division that $p(x)$ in Example 7.6.6 is not divisible by any 
irreducible quadratic polynomial, as claimed. 


2. (a) Factor $X^{16} - X$ into irreducibles in $\mathbb{Z}_2[X]$. 
(b) Show that $X^3 + X + 1$ is irreducible over $F = \mathbb{Z}_2[x]/(p(x))$ where 
$p(x) = x^4 + x^3 + x^2 + x + 1$. 

3. Show that for any polynomial $q \in \mathbb{Z}_p[x]$ (where $p$ is prime), the 
polynomial $q^{p^d} - q$ is divisible by $x^{p^d} - x$. 
HINT: consider its roots in $\mathbb{F}_{p^d}$. 

4. (a) Find $\mathrm{ord}([x])$ in $F = \mathbb{Z}_2[x]/(x^4 + x^3 + x^2 + x + 1)$. Notice that $[x]$ is 
a generator of $F$ but not a primitive root. 
(b) Find a primitive root for $F$. 
(c) Factor $X^4 + X^3 + 1$ in $F[X]$. 


5. If $p \ne 3$ is prime, find a criterion for $a \in \mathbb{F}_{p^n}$ to be a perfect cube. 


7.7. Automorphisms of $\mathbb{F}_{p^d}$ 

In the study of fields, the set of isomorphisms of the field onto itself (which 
are called automorphisms) is very important. It is a crucial idea of Galois 
theory. Galois theory can be used to explain why certain polynomials of 
degree at least 5 cannot be solved by repeated $k$th roots, $k \ge 2$. It is also used 
to show that certain angles cannot be trisected by a procedure using only 
a straight-edge and a compass. In the case of finite fields, we may analyze 
these automorphisms more concretely. The key is the following observa- 
tion showing that there is a special automorphism called the Frobenius 
automorphism for each finite field. 


7.7.1. Lemma. Let $\mathbb{F}_{p^d}$ be a finite field. The map $\varphi : \mathbb{F}_{p^d} \to \mathbb{F}_{p^d}$ given by 

$$\varphi(a) = a^p$$

is an isomorphism. Moreover, $\varphi(a) = a$ if and only if $a \in \mathbb{Z}_p$. 
Proof. We see $\varphi(0) = 0$ and $\varphi(1) = 1$. Also, 

$$\varphi(ab) = (ab)^p = a^p b^p = \varphi(a)\varphi(b).$$
So $\varphi$ is multiplicative. The key is that it is also additive. Note that if 
$1 \le i \le p - 1$, then 
$$\binom{p}{i} = \frac{p!}{i!\,(p - i)!}$$
is a multiple of $p$ because $p$ divides the numerator but not the denominator. 
Thus, because computations in $F$ are done modulo $p$, 

$$\varphi(a + b) = (a + b)^p = \sum_{i=0}^{p} \binom{p}{i} a^i b^{p-i} = a^p + b^p = \varphi(a) + \varphi(b).$$

Hence we see that $\varphi$ preserves all the field operations. Next let us check 
that $\varphi$ is a bijection. If 

$$0 = \varphi(a) - \varphi(b) = \varphi(a - b) = (a - b)^p,$$
then it follows that $a - b = 0$, or $a = b$. So $\varphi$ is a one-to-one map of $F$ into 
itself. As $F$ is finite, it is also onto. Hence $\varphi$ is a bijection. Therefore it is 
an automorphism. 
Finally, notice that $\varphi(a) = a$ if and only if $a$ is a root of $X^p - X$. By 
Fermat's little theorem, every element of $\mathbb{Z}_p$ is a root. This accounts for $p$ 
roots of this polynomial of degree $p$. Hence there are no others. ∎ 
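
The divisibility of the middle binomial coefficients is the heart of this proof, and it is easy to test numerically. The sketch below is only an illustration (the function names are hypothetical); it also shows that the property fails for composite exponents such as 4 and 6, and checks the resulting identity $(a+b)^p \equiv a^p + b^p$ in $\mathbb{Z}_7$.

```python
from math import comb

def middle_binomials_vanish(n):
    """True when n divides C(n, i) for every 1 <= i <= n - 1."""
    return all(comb(n, i) % n == 0 for i in range(1, n))

def frobenius_is_additive(p):
    """Check (a + b)^p = a^p + b^p for all a, b in Z_p."""
    return all(pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
               for a in range(p) for b in range(p))

print([n for n in range(2, 14) if middle_binomials_vanish(n)])
print(frobenius_is_additive(7))
```

The first line printed lists exactly the primes below 14.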


7.7.2. Example. Consider the field of 8 elements $\mathbb{F}_8$. The Frobenius 
automorphism is $\varphi(a) = a^2$. So $\varphi^2(a) = \varphi(\varphi(a)) = a^4$ is also an automorphism 
of $\mathbb{F}_8$. Similarly, $\varphi^3(a) = a^8$ is an automorphism. However, by Fermat's 
little theorem for finite fields, $a^8 = a$ for every $a \in F$. So $\varphi^3$ is the identity 
map. Using the multiplication table we can construct the following 
table. 

    a            | \varphi(a)   | \varphi^2(a) | \varphi^3(a)
    0            | 0            | 0            | 0
    1            | 1            | 1            | 1
    x            | x^2          | x^2 + x      | x
    x + 1        | x^2 + 1      | x^2 + x + 1  | x + 1
    x^2          | x^2 + x      | x            | x^2
    x^2 + 1      | x^2 + x + 1  | x + 1        | x^2 + 1
    x^2 + x      | x            | x^2          | x^2 + x
    x^2 + x + 1  | x + 1        | x^2 + 1      | x^2 + x + 1

FIGURE 7.7.1. Automorphisms of $\mathbb{F}_8$ 


Recall that we showed that in $\mathbb{F}_8[X]$, we can factor 

$$X^3 + X + 1 = (X - x)(X - x^2)(X - x^2 - x).$$

Observe that $x$, $\varphi(x) = x^2$ and $\varphi(x^2) = x^2 + x$ are the three roots of this 
polynomial. Also $\varphi(x^2 + x) = x$; so $\varphi$ just permutes the roots. 
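
The table of Frobenius images can be regenerated mechanically. In the sketch below, which is an illustration under stated assumptions rather than the construction used in the text, $\mathbb{F}_8$ is taken to be $\mathbb{Z}_2[x]/(x^3 + x + 1)$ and an element is encoded as a 3-bit integer whose bit $i$ is the coefficient of $x^i$ (so 6 stands for $x^2 + x$).

```python
MOD = 0b1011   # x^3 + x + 1 over Z_2

def mul(a, b):
    """Carry-less multiplication of 3-bit field elements, reduced by MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a            # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1               # multiply a by x
        if a & 0b1000:
            a ^= MOD          # replace x^3 by x + 1
    return r

def frob(a):
    """The Frobenius automorphism a -> a^2 of F_8."""
    return mul(a, a)

table = [(a, frob(a), frob(frob(a)), frob(frob(frob(a)))) for a in range(8)]
fixed = {a for a in range(8) if frob(a) == a}
for row in table:
    print(row)
```

Row 2 of the output reads (2, 4, 6, 2), that is, $x \mapsto x^2 \mapsto x^2 + x \mapsto x$, and the only fixed points are 0 and 1.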


This demonstrates a useful property of automorphisms of $F$ for the 
purpose of studying polynomials in $\mathbb{Z}_p[X]$. Every automorphism of $F$ must 
permute the roots of these polynomials in $\mathbb{Z}_p[X]$. 


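7.7.3. Lemma. Let $\psi$ be an automorphism of $\mathbb{F}_{p^d}$. Then $\psi(a) = a$ for all 
$a \in \mathbb{Z}_p$. If $q \in \mathbb{Z}_p[X]$ and $a \in \mathbb{F}_{p^d}$ is a root of $q$, then $\psi(a)$ is also a root of 
$q$. 


Proof. First, since $\psi(1) = 1$, we have 

$$\psi(k) = \psi(\underbrace{1 + \dots + 1}_{k \text{ terms}}) = \underbrace{\psi(1) + \dots + \psi(1)}_{k \text{ terms}} = \underbrace{1 + \dots + 1}_{k \text{ terms}} = k.$$

This shows that $\psi$ is the identity on $\mathbb{Z}_p$. 
Now let $q(X) = q_0 + q_1 X + \dots + q_n X^n$ be a polynomial with coefficients 
$q_i \in \mathbb{Z}_p$. If $a$ is a root, then 

$$0 = \psi(q(a)) = \sum_{i=0}^{n} \psi(q_i)\psi(a^i) = \sum_{i=0}^{n} q_i \psi(a)^i = q(\psi(a)).$$

So $\psi(a)$ is also a root. Indeed, applying $\psi$ to all the roots of $q$ yields a 
permutation of the roots. ∎ 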


7.7.4. Corollary. Let $a$ be a primitive root of $\mathbb{F}_{p^d}$, and let $q \in \mathbb{Z}_p[X]$ 
be its minimal polynomial. Then $q$ has $d$ distinct roots: $\varphi^k(a) = a^{p^k}$ for 
$0 \le k \le d - 1$, where $\varphi$ is the Frobenius automorphism. 


Proof. By the previous lemma, since $a$ is a root of $q$, then so are 
$$a_1 = \varphi(a) = a^p, \quad a_2 = \varphi(a_1) = \varphi^2(a) = a^{p^2}, \quad a_3 = \varphi(a_2) = a^{p^3},$$
and so on. Indeed, each $a_k = \varphi^k(a) = a^{p^k}$ must be a root of $q$ for all $k \ge 0$. 
For $0 \le k \le d - 1$, these are all different roots because $a$ is a primitive root. 
This accounts for all $d$ roots of $q$. Of course, by Fermat's little theorem for 
finite fields, $\varphi^d(a) = a^{p^d} = a$. So the sequence starts repeating itself at that 
point. ∎ 
7.7.5. Lemma. Let $\mathbb{F}_{p^d}$ be a finite field, and let $a$ be a generator of $\mathbb{F}_{p^d}$. If 
$\psi_1$ and $\psi_2$ are automorphisms of $F$ such that $\psi_1(a) = \psi_2(a)$, then $\psi_1 = \psi_2$. 


Proof. Since the $\psi_i$ are isomorphisms, it follows that 

$$\psi_1(r(a)) = r(\psi_1(a)) = r(\psi_2(a)) = \psi_2(r(a))$$
for every polynomial $r \in \mathbb{Z}_p[X]$. Since $a$ is a generator, this accounts for 
every element of $F$. So $\psi_1 = \psi_2$. ∎ 


This brings us to the main theorem of this section. 


7.7.6. Theorem. Let $\mathbb{F}_{p^d}$ be a finite field, and let $\varphi$ be the Frobenius 
automorphism. Then $d$ is the smallest positive integer $k$ such that $\varphi^k = \mathrm{id}$. 
Moreover, the set of all automorphisms of $\mathbb{F}_{p^d}$ is given by 

$$\{\mathrm{id}, \varphi, \varphi^2, \dots, \varphi^{d-1}\}.$$


Proof. Notice that $\varphi^k(a) = a^{p^k}$. Hence the fixed point set 
$$\{a \in \mathbb{F}_{p^d} : \varphi^k(a) = a\}$$
consists of the roots of the polynomial $X^{p^k} - X$. For $k < d$, this is a proper 
subset of $\mathbb{F}_{p^d}$ because the polynomial has at most $p^k$ roots. So $\varphi^k \ne \mathrm{id}$. But 
every element of $\mathbb{F}_{p^d}$ is a root of $X^{p^d} - X$ by Fermat's little theorem for 
finite fields. Thus $\varphi^d = \mathrm{id}$. 

Let $\psi$ be any automorphism of $\mathbb{F}_{p^d}$. Fix a primitive root $a$ in $\mathbb{F}_{p^d}$, and 
let $q$ be its minimal polynomial in $\mathbb{Z}_p[X]$. By Lemma 7.7.3, $\psi(a)$ is another 
root of $q$. And by Corollary 7.7.4, there is an integer $k$ so that $\psi(a) = \varphi^k(a)$. 
By Lemma 7.7.5, $\psi = \varphi^k$. Therefore every automorphism of $\mathbb{F}_{p^d}$ is a power 
of the Frobenius automorphism. ∎ 


Exercises 


1. Let $F = \mathbb{Z}_3[x]/(x^4 + x^2 + x + 1)$. Show that $q(X) = X^4 + X^2 + X + 1$ 
factors as 

$$q(X) = (X - x)(X - x^3)(X - x^9)(X - x^{27}).$$


2. With F as above, use the fact that x? +1 is a root of the irreducible 
polynomial X4 — 2X3 — 2X? 4 2X +2 to find the other roots. 


3. Show that every $a \in \mathbb{F}_{p^n}$ has a unique $p$th root. 


4. Let $p$ be prime and $n \in \mathbb{N}$. Show that $n$ divides the Euler number 
$\varphi(p^n - 1)$. 
HINT: this is the number of primitive roots. Show that they split into 
disjoint subsets $S_a = \{\varphi^k(a) : 0 \le k < n\}$ of size $n$ for primitive roots $a$. 


5. (a) For any divisor $d$ of $n$, show that the roots of $X^{p^d} - X$ in $\mathbb{F}_{p^n}$ form 
a subfield isomorphic to $\mathbb{F}_{p^d}$. 
HINT: Use the fact that $\varphi^d$ is an automorphism to show that this 
set of roots forms a field. 
(b) Deduce that this is the unique subfield of cardinality $p^d$. 
(c) Show that every automorphism of $\mathbb{F}_{p^n}$ maps this subfield onto itself. 


6. If $a \in \mathbb{F}_{p^n}^*$, its conjugates are $\{\varphi^k(a) : 0 \le k < n\} = \{a = a_1, a_2, \dots, a_d\}$. 
Let $q$ be the minimal polynomial for $a$. 
(a) Show that the conjugates of $a$ are roots of $q$. 
(b) Show that the polynomial $p(x) = \prod_{i=1}^{d}(x - a_i)$ has coefficients which 
are fixed by $\varphi$. 
(c) Deduce that $p = q$. So the roots of $q$ are exactly the conjugates of $a$. 
(d) Show that $d \mid n$. 
HINT: the smallest $e > 0$ such that $\varphi^e(a) = a$ divides $n$. 


7. Define the trace on $\mathbb{F}_{p^n}$ by $\mathrm{Tr}(a) = \sum_{k=0}^{n-1} \varphi^k(a)$. 
(a) Show that $\mathrm{Tr}(a) \in \mathbb{F}_p$. 
(b) Show that $\mathrm{Tr}(a + b) = \mathrm{Tr}(a) + \mathrm{Tr}(b)$ for $a, b \in \mathbb{F}_{p^n}$. 
(c) Show that $\mathrm{Tr}(\beta a) = \beta\,\mathrm{Tr}(a)$ for $\beta \in \mathbb{F}_p$ and $a \in \mathbb{F}_{p^n}$. 
(d) Show that $\mathrm{Tr}(\beta) = n\beta$ for $\beta \in \mathbb{F}_p$. 
(e) Show that $\mathrm{Tr}(a^p) = \mathrm{Tr}(a)$ for $a \in \mathbb{F}_{p^n}$. 

7.8. Irreducible polynomials of all degrees 


We have made the implicit assumption in the preceding discussion that 
irreducible polynomials exist in abundance. In this section, we will show that 
there are irreducible polynomials in $\mathbb{Z}_p[X]$ of every degree for every prime 
$p$. First let us take note of something we already know. 


7.8.1. Lemma. Let $q \in \mathbb{Z}_p[X]$ be an irreducible polynomial of degree $d$. 
Then $q$ is a factor of $X^{p^d} - X$. 


Proof. Form the field $\mathbb{Z}_p[X]/(q)$. This has $p^d$ elements, and the element 
$[x]$ is a root of $q$. Since $q$ is irreducible, it is the minimal polynomial of $[x]$. 
By Fermat's little theorem, $[x]$ is a root of $X^{p^d} - X$. Therefore, $q$ divides 
$X^{p^d} - X$. ∎ 


A converse of sorts requires some more sophisticated argument. First 
we need an elementary, yet rather clever, calculation. 


7.8.2. Lemma. $\gcd(X^m - 1, X^n - 1) = X^d - 1$ where $d = \gcd(m, n)$. 

Proof. If $m = dk$, then 
$$X^m - 1 = (X^d - 1)(X^{d(k-1)} + X^{d(k-2)} + \dots + X^d + 1).$$

Thus $X^d - 1$ divides both $X^m - 1$ and $X^n - 1$. By the Euclidean algorithm, 
there are positive integers $s$ and $t$ so that $d = ms - nt$. So if we define 

$$S(X) = 1 + X^m + X^{2m} + \dots + X^{(s-1)m},$$
$$T(X) = (1 + X^n + X^{2n} + \dots + X^{(t-1)n})\,X^d,$$

then 
$$(X^m - 1)S(X) - (X^n - 1)T(X) = (X^{ms} - 1) - (X^{nt+d} - X^d) = X^d - 1.$$

So any common divisor of $X^m - 1$ and $X^n - 1$ divides $X^d - 1$. Thus the gcd 
of $X^m - 1$ and $X^n - 1$ equals $X^d - 1$. ∎ 


7.8.3. Corollary. $\gcd(p^m - 1, p^n - 1) = p^d - 1$ where $d = \gcd(m, n)$. 


Proof. Substituting $p$ for $X$ shows that $p^d - 1$ divides both $p^m - 1$ and 
$p^n - 1$. The proof of the previous lemma shows that $\gcd(p^m - 1, p^n - 1)$ 
divides 

$$(p^m - 1)S(p) - (p^n - 1)T(p) = p^d - 1.$$
Hence $\gcd(p^m - 1, p^n - 1) = p^d - 1$. ∎ 
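
The integer identity is easy to confirm experimentally. The following sketch, with a hypothetical helper name, tests it for a few small primes and all exponents up to 12.

```python
from math import gcd

def identity_holds(p, m, n):
    """Check gcd(p^m - 1, p^n - 1) == p^gcd(m, n) - 1."""
    return gcd(p**m - 1, p**n - 1) == p**gcd(m, n) - 1

all_ok = all(identity_holds(p, m, n)
             for p in (2, 3, 5, 7)
             for m in range(1, 13)
             for n in range(1, 13))
print(all_ok, gcd(7**3 - 1, 7**15 - 1) == 7**3 - 1)
```

In particular $7^3 - 1$ divides $7^{15} - 1$ because 3 divides 15.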


7.8.4. Lemma. Let $q \in \mathbb{Z}_p[X]$ be an irreducible polynomial of degree $d$. 
Then $q$ is a factor of $X^{p^n} - X$ if and only if $d \mid n$. 


Proof. The case $q = X$ is trivial, so suppose that $q \ne X$. 

Suppose that $d \mid n$. Then by Lemma 7.8.1, we have $q \mid X^{p^d - 1} - 1$. By 
the preceding corollary, $p^d - 1$ divides $p^n - 1$. Hence by Lemma 7.8.2, $X^{p^d - 1} - 1$ 
divides $X^{p^n - 1} - 1$. So $q$ divides $X^{p^d} - X$ which divides $X^{p^n} - X$. 

Conversely, suppose that $q$ divides $X^{p^n} - X$. Since $X^{p^n} - X$ factors 
into linear terms in $\mathbb{F}_{p^n}$, so does $q$. Let $a \in \mathbb{F}_{p^n}$ be a root of $q$. Since $q$ is 
irreducible, this is the minimal polynomial of $a$. Hence $\mathbb{Z}_p[a]$ is isomorphic 
to $\mathbb{Z}_p[X]/(q)$, which has cardinality $p^d$. By Corollary 7.3.3, 

$$\prod_{b \in \mathbb{Z}_p[a]} (X - b) = X^{p^d} - X.$$
This divides $X^{p^n} - X$ in $\mathbb{F}_{p^n}[X]$. Because both have coefficients in $\mathbb{Z}_p$, the 
quotient also lies in $\mathbb{Z}_p[X]$. So $X^{p^d - 1} - 1$ divides $X^{p^n - 1} - 1$. By Lemma 7.8.2, 
$p^d - 1$ divides $p^n - 1$. And by Corollary 7.8.3, $d$ divides $n$. ∎ 


Our next goal is to show that if $q \in \mathbb{Z}_p[X]$ is an irreducible polynomial 
of degree $d$ and $d \mid n$, then $q^2$ does not divide $X^{p^n} - X$. To prove this, we 
need a method to identify repeated roots. The key tool we use is the formal 
derivative. 

7.8.5. Definition. Let $F$ be any field and let $q(x) = \sum_{i=0}^{d} q_i x^i$ be an element 
of $F[x]$. Then its formal derivative is given by 

$$q'(x) = \sum_{i=1}^{d} i\,q_i x^{i-1}.$$


7.8.6. Lemma. For a polynomial $q \in F[X]$, all irreducible factors of $q$ are 
simple if and only if $\gcd(q, q') = 1$. Moreover, if there are repeated roots, 
this gcd provides a proper factor except when $F$ has characteristic $p$ and $q$ is 
a perfect $p$-th power. In either case, this yields a factorization of $q$. 


Proof. In Exercise 3, the reader will verify the product rule 
$$(qr)' = q'r + qr'.$$

If $q$ has a repeated factor $u$, then we can write $q = u^2 v$ for some $v \in F[X]$. 
Calculate 

$$q' = (u^2 v)' = 2uu'v + u^2 v' = u(2u'v + uv').$$

Hence $u$ divides $\gcd(q, q')$. 

This gcd provides a proper factor of $q$ except in the special case in which 
$\gcd(q, q') = q$. But since $\deg(q') < \deg(q)$, this can only occur when $q' = 0$. 
This can never happen over the rationals, or any field of characteristic 0. 
However, in a field of characteristic $p$, this can happen if $i q_i \equiv 0 \pmod p$ for 
every index $i$. Clearly this means that $q_i$ is non-zero only when $i \equiv 0 \pmod p$. 
In this case 

$$q(X) = \sum_{j=0}^{m} a_j X^{pj}.$$

Let $u = \sum_{j=0}^{m} a_j X^j$. By Lemma 7.7.1 above, the $p$-th power of a sum 
is the sum of the $p$-th powers in any field of characteristic $p$. In particular, 
when the coefficients lie in $\mathbb{Z}_p$ (so that $a_j^p = a_j$), we get $q = u^p$. This yields a 
factorization of $q$. 

Conversely, suppose that $u$ is an irreducible factor of $q$ which is simple, 
so that $q = uv$ where $v \in F[X]$ satisfies $\gcd(u, v) = 1$. Then 

$$q' = (uv)' = u'v + uv' \equiv u'v \pmod u.$$

Now $u' \ne 0$ since $u$ is irreducible (and thus is not a $p$-th power), and $u'$ is 
of lower degree than $u$. So both $u'$ and $v$ are relatively prime to $u$. By the 
unique factorization theorem, the product $u'v$ is also relatively prime to $u$. 
Hence $u$ is not a factor of $q'$. 

Consequently, if $q$ has only simple factors, it can have no factor in 
common with $q'$. Therefore $\gcd(q, q') = 1$. ∎ 
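
To see the lemma in action, here is a minimal sketch that computes formal derivatives and gcds in $\mathbb{Z}_5[x]$. Polynomials are stored as coefficient lists with the constant term first; this representation and the helper names (trim, divmod_poly, gcd_poly, derivative) are assumptions made for the illustration, and pow(c, -1, p) requires Python 3.8 or later.

```python
def trim(f, p):
    """Reduce coefficients mod p and drop leading zeros."""
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def divmod_poly(f, g, p):
    """Quotient and remainder of f by a non-zero g in Z_p[x]."""
    f, g = trim(f, p), trim(g, p)
    inv = pow(g[-1], -1, p)                 # inverse of the leading coefficient
    quot = [0] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        shift, c = len(f) - len(g), f[-1] * inv % p
        quot[shift] = c
        f = trim([a - c * g[i - shift] if 0 <= i - shift < len(g) else a
                  for i, a in enumerate(f)], p)
    return trim(quot, p), f

def gcd_poly(f, g, p):
    """Monic gcd of f and g in Z_p[x] (f must be non-zero)."""
    f, g = trim(f, p), trim(g, p)
    while g:
        f, g = g, divmod_poly(f, g, p)[1]
    inv = pow(f[-1], -1, p)
    return [c * inv % p for c in f]

def derivative(f, p):
    """Formal derivative: the coefficient of x^(i-1) is i * f_i."""
    return trim([i * c for i, c in enumerate(f)][1:], p)

p = 5
repeated = [2, 0, 4, 1]            # (x + 1)^2 (x + 2) = x^3 + 4x^2 + 2 mod 5
pth_power = [2, 0, 0, 0, 0, 1]     # x^5 + 2 = (x + 2)^5 mod 5
simple = [4, 2, 0, 1]              # x^3 + 2x - 1, irreducible mod 5

g1 = gcd_poly(repeated, derivative(repeated, p), p)
g2 = gcd_poly(pth_power, derivative(pth_power, p), p)
g3 = gcd_poly(simple, derivative(simple, p), p)
print(g1, g2, g3)
```

The three cases of the lemma appear in turn: the gcd is the repeated factor $x + 1$, then the whole polynomial (its derivative vanishes), then 1.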


We can now describe the factorization of $X^{p^n} - X$ into irreducibles in 
$\mathbb{Z}_p[X]$. 

7.8.7. Corollary. $X^{p^n} - X$ factors in $\mathbb{Z}_p[X]$ as the product of all monic 
irreducible polynomials $q$ of degree $d$ as $d$ runs over all divisors of $n$. 


Proof. Let $f(X) = X^{p^n} - X$. The formal derivative is 
$$f'(X) = p^n X^{p^n - 1} - 1 = -1$$

and so $\gcd(f, f') = 1$. Since $f(X)$ is not a perfect $p$-th power, by the preceding 
lemma all of the irreducible factors of $f(X)$ are simple. The result now follows 
from Lemma 7.8.4. ∎ 


We are finally ready to prove the main result of this section. 


7.8.8. Theorem. There are irreducible polynomials in $\mathbb{Z}_p[X]$ of degree $n$ 
for every $n$. 


Proof. Let $r_d(X)$ denote the product of all monic irreducible polynomials 
of $\mathbb{Z}_p[X]$ of degree $d$. From Corollary 7.8.7, we obtain that 

$$X^{p^n} - X = \prod_{d \mid n} r_d(X).$$
Therefore 
$$p^n = \deg(X^{p^n} - X) = \sum_{d \mid n} \deg(r_d(X)).$$

We will show that $r_n$ has positive degree by showing that the sum of the degrees 
of the other factors of $X^{p^n} - X$ is strictly less than $p^n$. Note that since $r_d$ 
divides $X^{p^d} - X$, it has $\deg(r_d) \le p^d$. Thus a crude estimate shows 

$$\deg\Big(\prod_{d \mid n,\ d \ne n} r_d\Big) \le \sum_{d \mid n,\ d \ne n} p^d \le \sum_{i=1}^{n-1} p^i = \frac{p^n - p}{p - 1} < p^n.$$

So $r_n$ must have non-zero degree. ∎ 


7.8.9. Remark. We are able to prove Theorem 7.8.8 by crudely bounding 
the number of irreducible polynomials of a given degree. In Exercise 7, we 
prove a formula giving the exact number of irreducible polynomials in $\mathbb{Z}_p[x]$ 
of degree $n$. It is actually rather large. 


Here are two easy consequences of this theorem. 


7.8.10. Corollary. There is a finite field $\mathbb{F}_{p^n}$ of cardinality $p^n$ for every 
prime $p$ and integer $n \ge 1$. 


7.8.11. Corollary. There are irreducible polynomials of every degree in 
$\mathbb{Z}[X]$ using only 0's and 1's as coefficients. 

Proof. Take an irreducible polynomial of degree $n$ in $\mathbb{Z}_2[X]$. Then the 
corresponding polynomial in $\mathbb{Z}[X]$ is irreducible by the corollary on reduction 
modulo $p$. ∎ 


7.8.12. Example. Consider the polynomial $X^{31} - 1$ in $\mathbb{Z}_7[X]$. First look 
for the smallest integer $d$ so that $X^{31} - 1$ divides $X^{7^d - 1} - 1$. By the lemma 
on $\gcd(X^m - 1, X^n - 1)$, this occurs precisely when 31 divides $7^d - 1$; that is, 
when $7^d \equiv 1 \pmod{31}$. So we are interested in $\mathrm{ord}_{31}(7)$. By Fermat's little 
theorem, this is a divisor of 30. A calculation shows that 

$$7^3 \equiv 2 \pmod{31} \quad\text{and}\quad 7^5 \equiv 5 \pmod{31}.$$
Therefore, $7^6 \equiv 4 \pmod{31}$, $7^{10} \equiv -6 \pmod{31}$ and $7^{15} \equiv 1 \pmod{31}$. 
Hence, $\mathrm{ord}_{31}(7) = 15$. 
So $X^{31} - 1$ divides $X^{7^{15} - 1} - 1$. Since 31 is prime, Lemma 7.8.2 yields 
that 
$$\gcd(X^{31} - 1, X^{7^k - 1} - 1) = X - 1 \quad\text{for } k = 1, 3, 5.$$

Consequently, it follows from Lemma 7.8.4 that $X^{31} - 1$ has one linear factor, 
$X - 1$, and no irreducible factors of degree 3 or 5. Therefore it must factor 
as the product of $X - 1$ and two irreducible polynomials $p_1$, $p_2$ of degree 15. 
Symbolic computation software such as MAPLE or MATHEMATICA can find 
these factors easily. In particular, p; equals 


Pa) Gt! C=.) Co Glee e aes a) eee ake ey eee Cle ames ane 


Consider the field $F = \mathbb{Z}_7[x]/(p_1)$ of order $7^{15}$. The element $[x]$ is a 
root of $p_1$, and thus is a root of $X^{31} - 1$. Hence $\mathrm{ord}([x])$ divides 31. Since 
$[x] \ne [1]$ and 31 is prime, we find that $\mathrm{ord}([x]) = 31$. In particular, $[x]$ is not 
a primitive root. However, it is clearly a generator of $F$. 

Let us try to count the irreducible factors of $X^{7^{15}} - X$. From the theory 
we have developed, it factors as 

$$X^{7^{15}} - X = r_1(X)\,r_3(X)\,r_5(X)\,r_{15}(X)$$
where $r_d(X)$ is the product of all monic irreducible factors of degree $d$. We 
also know that 
$$r_1(X) = X^7 - X = X(X - 1)(X - 2)(X - 3)(X + 3)(X + 2)(X + 1)$$
$$r_3(X) = (X^{7^3 - 1} - 1)/(X^6 - 1)$$
$$r_5(X) = (X^{7^5 - 1} - 1)/(X^6 - 1)$$
$$r_{15}(X) = (X^{7^{15} - 1} - 1)(X^6 - 1) \big/ \big((X^{7^5 - 1} - 1)(X^{7^3 - 1} - 1)\big).$$
So we see that there are 7 irreducible polynomials of degree 1. The degree 
of $r_3$ is $7^3 - 7 = 336$. So there are 112 irreducible polynomials of degree 3 
over $\mathbb{Z}_7$. Similarly, the degree of $r_5$ is $7^5 - 7$. So there are $(7^5 - 7)/5 = 3360$ 
irreducible polynomials of degree 5 over $\mathbb{Z}_7$. Finally, we calculate the degree 
of $r_{15}$ to be $7^{15} - 7^5 - 7^3 + 7$. Dividing by 15 yields the number 316504099520 
of irreducible polynomials of degree 15. There are $\varphi(7^{15} - 1) = 1450340640000$ 
primitive roots of $F$. These come in groups of 15 corresponding to the roots 
of 96689376000 of these irreducible polynomials of degree 15. 
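
The counts in this example can be recomputed directly. The sketch below is an illustration with hypothetical function names: it uses the Möbius-sum formula for the number of monic irreducible polynomials of degree $n$ (the formula that Exercise 7 of this section asks the reader to prove, taken here as an assumption), together with trial-division factoring, which is adequate for numbers of this size.

```python
def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    out, d = [], 2
    while d * d <= n:
        if n % d == 0:
            out.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def mobius(n):
    """Moebius function: 0 unless n is square-free, else (-1)^(number of primes)."""
    result = 1
    for q in prime_factors(n):
        if n % (q * q) == 0:
            return 0
        result = -result
    return result

def count_irreducibles(p, n):
    """Number of monic irreducible polynomials of degree n over Z_p."""
    total = sum(mobius(n // d) * p**d for d in range(1, n + 1) if n % d == 0)
    return total // n

def euler_phi(n):
    result = n
    for q in prime_factors(n):
        result = result // q * (q - 1)
    return result

def multiplicative_order(a, m):
    """Least k >= 1 with a^k = 1 (mod m); a must be a unit mod m."""
    k, w = 1, a % m
    while w != 1:
        w = w * a % m
        k += 1
    return k

counts = {n: count_irreducibles(7, n) for n in (1, 3, 5, 15)}
primitive = euler_phi(7**15 - 1)
print(multiplicative_order(7, 31), counts, primitive, primitive // 15)
```

As a consistency check, the degrees add up: $\sum_{d \mid 15} d \cdot N_d = 7^{15}$, where $N_d$ is the count for degree $d$.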


Exercises 


1. Find an irreducible polynomial of degree 6 in $\mathbb{Z}_2[x]$. 


2. How many irreducible monic polynomials of degree 6 are there in $\mathbb{Z}_2[x]$? 
How many of these have roots which are primitive roots in $\mathbb{F}_{64}$? 


3. Verify the product rule for the formal derivative of polynomials in any 
field. 


4. Show that the only subfields of $\mathbb{F}_{p^n}$ are the fields $\mathbb{F}_{p^d}$ for $d \mid n$. 
HINT: combine Corollary and Section [7.7] Exercise 


5. Show that the fixed point set of $\varphi^k$ on $\mathbb{F}_{p^d}$ is the subfield $\mathbb{F}_{p^e}$ where 
$e = \gcd(k, d)$. 


6. In this exercise, we prove the Möbius inversion formula. Let $\mu : \mathbb{Z}^+ \to \mathbb{Z}$ 
be defined as follows. Let $\mu(1) = 1$, $\mu(n) = 0$ if $n$ is not square-free, 
and otherwise $\mu(n) = (-1)^k$, where $n$ is a product of $k$ distinct primes. 

(a) Prove $\sum_{d \mid n} \mu(d)$ is 1 if $n = 1$, and 0 otherwise. 
(b) For any functions $F, G : \mathbb{Z}^+ \to \mathbb{Z}$, let 

$$(F * G)(n) = \sum_{d \mid n} F(d)\,G(n/d).$$

Prove $*$ is an associative commutative binary operation on functions. 
(c) Find the function $H$ which is the identity for $*$. 
(d) Suppose f : Z* + Z and g: Z* > Z are functions, and that 


Prove that 


7. Let $p$ be a prime. Prove that there are 
$$\frac{1}{n} \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) p^d$$
monic irreducible degree $n$ polynomials in $\mathbb{F}_p[x]$. 
7.9. Factoring Algorithms for Polynomials 


In this section, we take a brief look at one method for factoring polynomials. 
It turns out that it is much easier to factor a polynomial of degree $d$ in $\mathbb{Z}_p[x]$ 
than to factor a number with $d$ digits in base $p$. This seems, on the surface, 
to be a surprising fact because the number and the polynomial have the 
same complexity. However, it turns out that the structure of finite fields is 
the key. 

The first step in factoring polynomials is to reduce the problem to the 
case in which the polynomial $q$ has no repeated factors, which may be done 
using Lemma 7.8.6. Compute $\gcd(q, q')$ and use this to factor $q$. Repeat as 
necessary until it is factored into terms with no repeated factors. 

We are now ready to study the main factoring algorithm of this 
section. It is the preferred method used in the symbolic computation program 
MAPLE. Also, it is perhaps the simplest and most effective way to factor 
polynomials in $\mathbb{Z}[x]$. The main idea is to factor polynomials modulo $p$ based 
on the Euclidean algorithm and Lemma 7.8.4. Then Hensel's Lemma, which 
will be discussed in the next section, is used to increase the information 
about the possible integer factorizations. 

The lemma on factors of $x^{p^n} - x$ shows that if $q \in \mathbb{Z}_p[x]$ is an irreducible 
polynomial of degree $d$, then $q$ divides $x^{p^d} - x$ but does not divide $x^{p^k} - x$ for $k < d$. 
We first compute $\gcd(q, x^p - x) = r_1$. Since $x^p - x = \prod_{a \in \mathbb{Z}_p} (x - a)$, this 
will produce a factor $r_1$ which we will later factor into a product of linear 
terms. Replace $q$ with $q_1 = q/r_1$. Next compute $\gcd(q_1, x^{p^2} - x) = r_2$. 
Since $r_2$ has no linear factors, all of its irreducible factors will be quadratic. 
Set $q_2 = q_1/r_2$ and define $\gcd(q_2, x^{p^3} - x) = r_3$. Then all of the irreducible 
factors of $r_3$ have degree 3. Proceed until the degree of $q$ is reached (although 
this will end sooner if factors are found). For this reason, this method is 
known as the distinct degree algorithm. 
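
The procedure just described can be sketched in a few dozen lines. What follows is an illustrative Python version under stated assumptions, not the MAPLE implementation: polynomials are coefficient lists with the constant term first, the input is assumed to have no repeated factors, and all function names are hypothetical. The loop stops early once the remaining cofactor is too small to contain two factors of the current degree, in which case it must itself be irreducible. The test polynomial is built from three factors over $\mathbb{Z}_5$ whose irreducible factors have degrees 1, 3 and 4.

```python
def trim(f, p):
    """Reduce coefficients mod p and drop leading zeros."""
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def sub(f, g, p):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) - (g[i] if i < len(g) else 0)
                 for i in range(n)], p)

def mul(f, g, p):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out, p)

def divmod_poly(f, g, p):
    """Quotient and remainder of f by a non-zero g in Z_p[x]."""
    f, g = trim(f, p), trim(g, p)
    inv = pow(g[-1], -1, p)
    quot = [0] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        shift, c = len(f) - len(g), f[-1] * inv % p
        quot[shift] = c
        f = trim([a - c * g[i - shift] if 0 <= i - shift < len(g) else a
                  for i, a in enumerate(f)], p)
    return trim(quot, p), f

def gcd_poly(f, g, p):
    """Monic gcd in Z_p[x] (f must be non-zero)."""
    f, g = trim(f, p), trim(g, p)
    while g:
        f, g = g, divmod_poly(f, g, p)[1]
    inv = pow(f[-1], -1, p)
    return [c * inv % p for c in f]

def powmod(f, e, m, p):
    """f^e mod m in Z_p[x] by square-and-multiply."""
    result, base = [1], divmod_poly(f, m, p)[1]
    while e:
        if e & 1:
            result = divmod_poly(mul(result, base, p), m, p)[1]
        base = divmod_poly(mul(base, base, p), m, p)[1]
        e >>= 1
    return result

def distinct_degree(q, p):
    """Return {k: r_k}, r_k = product of the irreducible factors of degree k."""
    q = gcd_poly(q, [], p)                  # normalise q to be monic
    x, h, k, factors = [0, 1], [0, 1], 0, {}
    while len(q) - 1 >= 2 * (k + 1):
        k += 1
        h = powmod(h, p, q, p)              # h = x^(p^k) mod q
        r = gcd_poly(q, sub(h, x, p), p)
        if len(r) > 1:
            factors[k] = r
            q = divmod_poly(q, r, p)[0]
            h = divmod_poly(h, q, p)[1] if len(q) > 1 else []
    if len(q) > 1:
        factors[len(q) - 1] = q             # the leftover cofactor is irreducible
    return factors

A = [2, 3, 1]                               # (x + 1)(x + 2)
B = [1, 0, 1, 4, 3, 0, 4, 2, 0, 1]          # product of three cubics mod 5
C = [4, 0, 0, 0, 0, 0, 2, 0, 1]             # x^8 + 2x^6 - 1, two quartics mod 5
q = mul(mul(A, B, 5), C, 5)
result = distinct_degree(q, 5)
print(result)
```

The output groups the factors by degree; splitting each group into its individual irreducible factors is a separate step.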

Now, these factors can be distinguished by using quadratic residues in 
finite fields. Write $r$ for one of the products found above, all of whose 
irreducible factors $r_i$ have degree $d$. When $p \ne 2$, half of the non-zero elements 
of a finite field are perfect squares. So a polynomial $t$ of degree at most $d - 1$ 
will be a square modulo $r_i$ about half the time. When $t$ is a square in 
$\mathbb{Z}_p[x]/(r_i)$, then by the criterion for squares in a finite field, 

$$t^{(p^d - 1)/2} \equiv 1 \pmod{r_i}.$$

And when $t$ is not a square, 
$$t^{(p^d - 1)/2} \equiv -1 \pmod{r_i}.$$
So it suffices to compute 
$$\gcd(r, t^{(p^d - 1)/2} - 1)$$

for several random choices of $t$ to obtain various proper factors of $r$. 

We won't work out exactly what happens when $p = 2$. Let $f(x) = 
\sum_{i=0}^{d-1} x^{2^i}$. Compute $\gcd(r, f \circ t)$ for random choices of $t \in \mathbb{Z}_2[x]$ of degree 
less than $d$. 
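
For odd $p$, one splitting attempt is a single modular exponentiation followed by a gcd. The sketch below is an illustration only (coefficient lists with the constant term first, hypothetical helper names); it is tried on two products of irreducible cubics over $\mathbb{Z}_5$, with $t = x$ and $t = x + 1$ chosen by hand rather than at random.

```python
def trim(f, p):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def mul(f, g, p):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out, p)

def divmod_poly(f, g, p):
    """Quotient and remainder of f by a non-zero g in Z_p[x]."""
    f, g = trim(f, p), trim(g, p)
    inv = pow(g[-1], -1, p)
    quot = [0] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        shift, c = len(f) - len(g), f[-1] * inv % p
        quot[shift] = c
        f = trim([a - c * g[i - shift] if 0 <= i - shift < len(g) else a
                  for i, a in enumerate(f)], p)
    return trim(quot, p), f

def gcd_poly(f, g, p):
    f, g = trim(f, p), trim(g, p)
    while g:
        f, g = g, divmod_poly(f, g, p)[1]
    inv = pow(f[-1], -1, p)
    return [c * inv % p for c in f]

def powmod(f, e, m, p):
    result, base = [1], divmod_poly(f, m, p)[1]
    while e:
        if e & 1:
            result = divmod_poly(mul(result, base, p), m, p)[1]
        base = divmod_poly(mul(base, base, p), m, p)[1]
        e >>= 1
    return result

def split_candidate(r, t, d, p):
    """gcd(r, t^((p^d - 1)/2) - 1) for an odd prime p."""
    h = powmod(t, (p**d - 1) // 2, r, p)
    h_minus_1 = trim([(h[0] if h else 0) - 1] + h[1:], p)
    return gcd_poly(r, h_minus_1, p)

r = [1, 0, 1, 4, 3, 0, 4, 2, 0, 1]   # x^9 + 2x^7 - x^6 - 2x^4 - x^3 + x^2 + 1 mod 5
first = split_candidate(r, [0, 1], 3, 5)              # t = x
rest = divmod_poly(r, first, 5)[0]
second = split_candidate(rest, [1, 1], 3, 5)          # t = x + 1
print(first, rest, second)
```

A gcd equal to 1 or to $r$ itself gives no information, in which case another $t$ is tried.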
7.9.1. Example. We demonstrate this algorithm via an explicit example. 
Consider the polynomial $q(x)$ in $\mathbb{Z}_5[x]$ given by 

$$q(x) = x^{19} + 3x^{18} + x^{17} + x^{16} - x^{15} + x^{14} + x^{12} + x^{11} - 2x^{10}$$
$$\qquad + x^9 - 2x^8 - 2x^6 + 2x^5 + x^4 - x^3 + 2x^2 + 2x - 2.$$

First a computation shows that $\gcd(q, q') = 1$. Then we compute 
$$\gcd(q, x^5 - x) = x^2 + 3x + 2 = (x + 1)(x + 2).$$
Factoring this out leaves $q_1 = q/(x^2 + 3x + 2)$. Continue 
$$\gcd(q_1, x^{25} - x) = 1$$
showing that there are no quadratic factors. Then 

$$\gcd(q_1, x^{125} - x) = x^9 + 2x^7 - x^6 - 2x^4 - x^3 + x^2 + 1.$$

This must be the product of three irreducible polynomials of degree 3. Since 
$(5^3 - 1)/2 = 62$, we compute 

$$\gcd(x^9 + 2x^7 - x^6 - 2x^4 - x^3 + x^2 + 1,\ x^{62} - 1) = x^3 + 2x - 1.$$
So 
$$x^9 + 2x^7 - x^6 - 2x^4 - x^3 + x^2 + 1 = (x^3 + 2x - 1)(x^6 - 2x - 1).$$

And 

$$\gcd(x^6 - 2x - 1,\ (x + 1)^{62} - 1) = x^3 - x^2 - 2.$$
Hence 

$$x^9 + 2x^7 - x^6 - 2x^4 - x^3 + x^2 + 1 = (x^3 + 2x - 1)(x^3 - x^2 - 2)(x^3 + x^2 + x - 2).$$

The remaining term is 
$$q_3 = q_1/(x^9 + 2x^7 - x^6 - 2x^4 - x^3 + x^2 + 1) = x^8 + 2x^6 - 1.$$

This is either irreducible, or the product of two irreducible factors of degree 
4. We try 

$$\gcd(q_3, x^{625} - x) = q_3.$$
So $q_3$ is a product of quartics. 
$$\gcd(q_3, x^{312} - 1) = 1$$
$$\gcd(q_3, (x + 1)^{312} - 1) = x^4 + x^2 - 2x + 2$$
Thus 
$$x^8 + 2x^6 - 1 = (x^4 + x^2 - 2x + 2)(x^4 + x^2 + 2x + 2).$$
This provides a complete factorization of $q(x)$: 
$$q(x) = (x + 1)(x + 2)(x^3 + 2x - 1)(x^3 - x^2 - 2)(x^3 + x^2 + x - 2)$$
$$\qquad \times\,(x^4 + x^2 - 2x + 2)(x^4 + x^2 + 2x + 2).$$
Exercises 


1. Use the distinct degree algorithm to factor q € Z7[x] given by 
ai? + 3a! + 301 + 9 + 20° + +60 + 6° + ot + 3a3 + 2? + 4 + 3. 


2. Use the distinct degree algorithm to factor the polynomial q € Zs5[z] 

given by 

q(x) = 08 +0" + 30° + 20° + 4r4 + 403 + 80 4+ 4. 
3. Factor in Z3[z]: 
q(x) =a tata? 28-28 4a? 41. 

A, Let f(x) = Se a2’ € Zo[z]. Show that f(f(z) +1) = 2" —1. 

Factor in Zg|z] the polynomial q(x) = 

Pall Gates gate awe ae are ae OVO sn oes Gar ae aie eee 


Remember to check for repeated factors. 


7.10. Factoring Rational Polynomials 


Now let us reconsider the problem of factoring polynomials with integer coef- 
ficients. This can now be done in a routine algorithmic way. The first step is 
to pick a prime p relatively prime to the leading coefficient of the polynomial 
q. For a computer, a good choice is reasonably large but still manageable 
for exact integer arithmetic. (MAPLE picks one near 10+.) Then use the 
distinct degree algorithm to factor $q(x) \pmod p$. Finally, use an algorithm 
we explain in this section known as Hensel's Lemma to recursively improve 
this factorization mod $p$ to a factorization mod $p^k$ until $p^k$ is large enough 
to bound the coefficients of the factors. This either yields a factorization or 
shows that one does not exist. 

This kind of search can be carried out efficiently on a computer. 
Moreover, it is not very difficult to get crude bounds on the size of the coefficients 
of possible factors. For example, if $q = \sum_{i=0}^{d} q_i x^i$ is a factor of $p = \sum_{i=0}^{n} p_i x^i$, 
then 
$$\sum_{i=0}^{d} |q_i| \le 2^d \Big(\sum_{i=0}^{n} |p_i|^2\Big)^{1/2}.$$

We will not prove such an estimate here. However, it means that normally 
only a few applications of Hensel's Lemma will do the job. Once $k$ is 
sufficiently large, we either find an integer factorization or realize that none 
exists. 

We make two simplifying assumptions that are easily achieved. Choose 
the prime $p$ so that it is relatively prime to the leading coefficient of our given 
polynomial $q \in \mathbb{Z}[x]$. Also, assume that $q$ factors in $\mathbb{Z}_p[x]$ into a product $uv$ 
where $u$ and $v$ are relatively prime. Of course, if $q$ is irreducible in $\mathbb{Z}_p[x]$, 
then $q$ is irreducible in $\mathbb{Z}[x]$ by the corollary on reduction modulo $p$, and thus 
in $\mathbb{Q}[x]$ by Gauss's Lemma 6.3.3. 


7.10.1 Hensel's Lemma. Suppose that $q(x) = \sum_{i=0}^{d} q_i x^i$ factors as 
$q \equiv uv \pmod p$. Furthermore, assume that $\gcd(q_d, p) = 1$ and that $u$ and $v$ are 
relatively prime in $\mathbb{Z}_p[x]$. Then there is an algorithm to calculate polynomials 
$u_k$ and $v_k$ in $\mathbb{Z}[x]$ so that 

$$q \equiv u_k v_k \pmod{p^k}$$

with $\deg(u_k) = \deg(u)$ and $\deg(v_k) = \deg(v)$. 


Proof. When the leading coefficient of qg isn’t 1, there is a slight problem 
because the leading coefficients of the factors u and v aren’t determined. 
However, they must be divisors of gg. So a simple trick deals with this 
problem. Replace q(x) by qaq(x) and multiply u and v by the appropriate 
factor so that their leading coefficient is also qg. Since the identity q = uv is 
only mod p, this adjustment can be made mod p, and then fixed up in the 
integers by adding some multiple of px”. 

We will also assume that m = deg(u) < deg(v) = n. By hypothesis, 
gcd(u,v) = 1 in Z,[z]. Thus by the Euclidean algorithm there are polyno- 
mials s and ¢ in Z[z] so that 


su+tv=1 (mod p). 


Let $u_1 = u$ and $v_1 = v$ and define $r_1 = (q - u_1 v_1)/p$, which has integer
coefficients by hypothesis. In fact, we only need $r_1 \pmod{p}$. Now find
integer polynomials $s_1$ and $t_1$ so that

$$s_1 u_1 + t_1 v_1 \equiv r_1 \pmod{p}$$


such that $\deg(s_1) < n$ and $\deg(t_1) < m$. To obtain this, notice that

$$(r_1 s)u_1 + (r_1 t)v_1 = r_1(su + tv) \equiv r_1 \pmod{p}.$$


Divide $u_1$ into $r_1 t$ to obtain quotient $a_1$ and remainder $t_1$ with $\deg(t_1) < m$,
so that $r_1 t \equiv a_1 u_1 + t_1 \pmod{p}$.
Set $s_1 = r_1 s + a_1 v_1 \pmod{p}$. We see that $(s_1, t_1)$ is a solution with control
on $\deg(t_1)$. The point is that $t_1 v_1$ is a polynomial of degree

$$\deg(t_1) + \deg(v_1) < m + n.$$


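By the identity $s_1 u_1 \equiv r_1 - t_1 v_1 \pmod{p}$, we see that the same is true for
$s_1 u_1$, because $q$ and $u_1 v_1$ have the same leading term and so $\deg(r_1) < m + n$
as well. Since $s_1$ was reduced mod $p$, we know that it has a leading coefficient
relatively prime to $p$. Thus its degree is the same as its degree mod $p$. So,

$$\deg(s_1) = \deg(s_1 u_1) - \deg(u_1) < m + n - m = n.$$
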
Now we are ready to improve the factorization. Set

$$u_2 := u_1 + p t_1, \qquad v_2 := v_1 + p s_1.$$

This does not affect the leading coefficients of the $u$’s or $v$’s. Then it is a
simple exercise to verify

$$q - u_2 v_2 = (u_1 v_1 + p r_1) - \big(u_1 v_1 + p(s_1 u_1 + t_1 v_1) + p^2 s_1 t_1\big)$$

$$= p(r_1 - s_1 u_1 - t_1 v_1) - p^2 s_1 t_1 \equiv 0 \pmod{p^2}.$$

This procedure repeats recursively. Indeed, if

$$q \equiv u_k v_k \pmod{p^k},$$

define $r_k = (q - u_k v_k)/p^k$. As above, set $t_k$ to be the remainder on dividing
$r_k t$ by $u_k$ mod $p$, with quotient $a_k$. Then set $s_k = r_k s + a_k v_k \pmod{p}$. The new
approximation is given by

$$u_{k+1} = u_k + p^k t_k, \qquad v_{k+1} = v_k + p^k s_k.$$

Then

$$q - u_{k+1} v_{k+1} = (u_k v_k + p^k r_k) - \big(u_k v_k + p^k(s_k u_k + t_k v_k) + p^{2k} s_k t_k\big)$$

$$= p^k(r_k - s_k u_k - t_k v_k) - p^{2k} s_k t_k \equiv 0 \pmod{p^{k+1}}.$$


Since this is accurate modulo $p^{k+1}$, reduce the coefficients mod $p^{k+1}$
symmetrically about 0 so that the coefficients have modulus at most $p^{k+1}/2$.
Repeating this procedure increases the ‘accuracy’ of the factorization by
a factor of $p$ at each stage. Moreover, every stage is a routine calculation.
The most complicated step, the Euclidean algorithm, is executed only once.
On a computer, this procedure is very efficient. ∎
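
The proof translates almost line by line into a program. The Python sketch below is one possible rendering, not the code used for the book's calculations: the function names and the list representation (coefficients, lowest degree first) are assumptions, $p$ is assumed prime, and $s$, $t$ with $su + tv \equiv 1 \pmod{p}$ are taken as inputs. It also assumes the leading coefficients of $u$ and $v$ multiply to that of $Q$, as arranged at the start of the proof.

```python
def trim(a):
    """Drop trailing zeros; polynomials are coefficient lists, lowest degree first."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def scale(a, c):
    return trim([c * x for x in a])

def mod(a, m):
    return trim([x % m for x in a])

def sym(a, m):
    """Reduce coefficients mod m into the symmetric range about 0."""
    return trim([x % m - m if x % m > m // 2 else x % m for x in a])

def divmod_p(a, b, p):
    """Quotient and remainder of a by b in Z_p[x] (p prime, b nonzero mod p)."""
    a, b = mod(a, p), mod(b, p)
    inv = pow(b[-1], -1, p)  # modular inverse, Python 3.8+
    quo = [0] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        c, shift = (a[-1] * inv) % p, len(a) - len(b)
        quo[shift] = c
        for i, y in enumerate(b):
            a[shift + i] = (a[shift + i] - c * y) % p
        a = trim(a)
    return trim(quo), a

def hensel_step(Q, u, v, s, t, p, pk):
    """From Q = u*v (mod pk), pk a power of p, produce a factorization mod pk*p."""
    r = mod([c // pk for c in add(Q, scale(mul(u, v), -1))], p)  # r_k mod p
    a, tk = divmod_p(mul(r, t), u, p)        # r_k t = a_k u_k + t_k (mod p)
    sk = mod(add(mul(r, s), mul(a, v)), p)   # s_k = r_k s + a_k v_k (mod p)
    return sym(add(u, scale(tk, pk)), pk * p), sym(add(v, scale(sk, pk)), pk * p)

def hensel_lift(Q, u, v, s, t, p, k):
    """Lift Q = u*v (mod p) to a factorization mod p**k."""
    pk = p
    for _ in range(k - 1):
        u, v = hensel_step(Q, u, v, s, t, p, pk)
        pk *= p
    return u, v

# Data of Example 7.10.2: Q = 6q, u1 = 6u, v1 = 6v, and s, t from Euclid mod 5.
Q = [-486, 2718, -1758, -198, 1800, -1044, 318, 36]
u1 = [12, 12, 12, 6]          # 6x^3 + 12x^2 + 12x + 12
v1 = [12, 12, 12, 6, 6]       # 6x^4 + 6x^3 + 12x^2 + 12x + 12
s, t = [3, 0, -1, -1], [0, 2, 1]
print(hensel_lift(Q, u1, v1, s, t, 5, 4))
# ([-3, 17, -13, 6], [162, 12, -48, 66, 6])
```

Note that only `divmod_p` works mod $p$; the products $u_k v_k$ are formed over the integers, which is why the expensive Euclidean step never has to be repeated.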


7.10.2. Example. Let us work through an example. The calculations
were done by a computer, although with such a small example, it is almost
practical to do it by hand. Let

$$q(x) = 6x^7 + 53x^6 - 174x^5 + 300x^4 - 33x^3 - 293x^2 + 453x - 81.$$


Suppose that we found the factorization

$$q \equiv (x^3 + 2x^2 + 2x + 2)(x^4 + x^3 + 2x^2 + 2x + 2) \pmod{5}.$$

Following our algorithm, we replace $q$ by $Q = 6q$, and set

$$u_1 = 6u \equiv 6x^3 + 2x^2 + 2x + 2 \pmod{5}$$
$$v_1 = 6v \equiv 6x^4 + x^3 + 2x^2 + 2x + 2 \pmod{5}$$


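By the Euclidean algorithm, solve $su_1 + tv_1 \equiv 1 \pmod{5}$:

$$s = -x^3 - x^2 + 3, \qquad t = x^2 + 2x.$$

The first step is to compute the remainder. Here the integer polynomials
$u_1 = 6u = 6x^3 + 12x^2 + 12x + 12$ and $v_1 = 6v = 6x^4 + 6x^3 + 12x^2 + 12x + 12$
are used, without first reducing their coefficients mod 5:

$$r_1 = (Q - u_1 v_1)/5$$
$$= 42x^6 - 252x^5 + 288x^4 - 126x^3 - 438x^2 + 486x - 126$$
$$\equiv 2x^6 + 3x^5 + 3x^4 + 4x^3 + 2x^2 + x + 4 \pmod{5}.$$
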
Then dividing $t r_1$ by $u_1$ mod 5 yields remainder $t_1 = x + 2$ and quotient $a_1$,
which is used to compute

$$s_1 = s r_1 + a_1 v_1 \equiv 2x^3 + 3x^2 \pmod{5}.$$

Then we obtain

$$u_2 = u_1 + 5t_1 \equiv 6x^3 + 12x^2 - 8x - 3 \pmod{25}$$
$$v_2 = v_1 + 5s_1 \equiv 6x^4 - 9x^3 + 2x^2 + 12x + 12 \pmod{25}$$


Check the remainder

$$r_2 = (Q - u_2 v_2)/25 \equiv 2x^6 + 4x^5 + x^4 + 3x^3 + 3x^2 + 4x + 2 \pmod{5}.$$

This isn’t 0, so continue on. We get $t_2 = 4x^2 + x$ as the remainder of $t r_2$ on
dividing by $u_2$ mod 5, with quotient $a_2$. And

$$s_2 = s r_2 + a_2 v_2 \equiv 3x^3 + 3x^2 + 1 \pmod{5}.$$

Then the next approximants are

$$u_3 = u_2 + 25t_2 \equiv 6x^3 - 13x^2 + 17x - 3 \pmod{125}$$
$$v_3 = v_2 + 25s_2 \equiv 6x^4 - 59x^3 - 48x^2 + 12x + 37 \pmod{125}$$


This time the remainder is

$$r_3 = (Q - u_3 v_3)/125 \equiv x^6 + 2x^5 + 2x^4 + 3x^3 + 2x^2 + 2x + 2 \pmod{5}.$$

This still isn’t 0, so continue on. We get $t_3 = 0$ as the remainder of $t r_3$ on
dividing by $u_3$ mod 5, with quotient $a_3$. And

$$s_3 = s r_3 + a_3 v_3 \equiv x^3 + 1 \pmod{5}.$$

Then the next approximants are

$$u_4 = u_3 + 125t_3 \equiv 6x^3 - 13x^2 + 17x - 3 \pmod{625}$$
$$v_4 = v_3 + 125s_3 \equiv 6x^4 + 66x^3 - 48x^2 + 12x + 162 \pmod{625}$$


This time we have found the factorization

$$Q = (6x^3 - 13x^2 + 17x - 3)(6x^4 + 66x^3 - 48x^2 + 12x + 162)$$

whence

$$q = (6x^3 - 13x^2 + 17x - 3)(x^4 + 11x^3 - 8x^2 + 2x + 27).$$

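
Each stage of the example can be checked independently by multiplying the approximants and comparing with $Q$ modulo the relevant power of 5. The short Python check below does this; the helper names `mul` and `congruent` are illustrative, and the coefficient lists (lowest degree first) are transcribed from the example.

```python
def mul(a, b):
    """Product of integer polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def congruent(f, g, m):
    """True when f and g (lists of equal length) agree coefficientwise mod m."""
    return len(f) == len(g) and all((x - y) % m == 0 for x, y in zip(f, g))

q = [-81, 453, -293, -33, 300, -174, 53, 6]
Q = [6 * c for c in q]

# (u_k, v_k) for 5^k = 25, 125, 625 as obtained in the example.
approximants = {
    25: ([-3, -8, 12, 6], [12, 12, 2, -9, 6]),
    125: ([-3, 17, -13, 6], [37, 12, -48, -59, 6]),
    625: ([-3, 17, -13, 6], [162, 12, -48, 66, 6]),
}
for modulus, (u, v) in approximants.items():
    assert congruent(Q, mul(u, v), modulus)

# The last pair is an exact factorization of Q, and dividing v_4 by 6 factors q.
assert mul([-3, 17, -13, 6], [162, 12, -48, 66, 6]) == Q
assert mul([-3, 17, -13, 6], [27, 2, -8, 11, 1]) == q
print("all checks pass")
```

The exact equality at the last stage is what distinguishes a true integer factorization from a further approximation: a congruence mod 625 alone would not be enough.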
Exercises 


1. Using computer software, follow the above procedure with p = 7 to 
factor 


q(x) = 6a" + 432° — 3632° — 30124 + 527x? — 152? — 3874 + 76. 


2. Factor in Z[z] the polynomial q(x) = 214 +31213 — 221 — 63x19 + 3129 + 
2728 + 897x7 + 33° + 40° + 54a4 + 3x3 — 580? + 277 4+ 1 given that 


q(x) = (2? + 2° + 227 + 8a +1)(2' +324+22-—2+1) (mod 5). 

Notes on Chapter 7


It was Galois who first realized that $\mathbb{Z}_p[x]/(q)$ formed a field for any
irreducible polynomial $q$. He introduced the idea of adjoining a root of a
polynomial to build a larger field. Dedekind was the first to suggest that 
there should be a general definition of field, although for him, a field was 
always a subset of C. This was the beginning of a deeper understanding of 
the relationship between algebra and number theory. Kronecker’s work was 
very influential: he allowed for a more abstract extension of a field by roots 
of polynomials. E.H. Moore classified finite fields in 1893. 

Dedekind and Weber constructed certain fields of analytic functions as- 
sociated to Riemann surfaces, and Hensel constructed the field of p-adic 
numbers. These very different types of fields paved the way to a general 
abstract definition of fields due to Steinitz in 1910. Many general theo- 
rems in field theory and Galois theory were proven by Weber and Steinitz. 
Subsequent work of Emil Artin modernized the treatment of Galois theory. 

See Kleiner’s short monograph for a brief history of modern algebra. 
Emil Artin’s book [4] on Galois theory is a nice introduction to field theory. 
See Lidl and Niederreiter for more detailed information about finite 
fields. Michael Artin’s comprehensive book on algebra [5] covers field theory
including a chapter on quadratic number fields. 

The discovery of algorithms for the factorization of polynomials is more 
recent. See Knuth [Section 4.6.2] for an overview of various methods.
Berlekamp [6] found the first general algorithm for factoring polynomials in
$\mathbb{Z}_p[x]$ by reducing the problem to a large system of linear equations. The
method discussed in these notes is due to D. Cantor and H. Zassenhaus [7]
in 1981. Hensel’s lemma dates back to 1904 in the same paper in which he 
introduced p-adic number fields. 